In the method of the present invention the homologous recombination 
preferably occurs via the recET mechanism, i.e. the homologous 
recombination is mediated by the gene products of the recE and the recT 
genes which are preferably selected from the E.coli genes recE and recT or 
functionally related genes such as the phage λ redα and redβ genes. 

The host cell suitable for the method of the present invention preferably is 
a bacterial cell, e.g. a gram-negative bacterial cell. More preferably, the host 
cell is an enterobacterial cell, such as Salmonella, Klebsiella or Escherichia. 
Most preferably the host cell is an Escherichia coli cell. It should be noted, 
however, that the cloning method of the present invention is also suitable 
for eukaryotic cells, such as fungi, plant or animal cells. 

The host cell used for homologous recombination and 
propagation of the cloned DNA can be any cell, e.g. a bacterial strain in 
which the products of the recE and recT, or redα and redβ, genes are 
expressed. The host cell may comprise the recE and recT genes located on 
the host cell chromosome or on non-chromosomal DNA, preferably on a 
vector, e.g. a plasmid. In a preferred case, the RecE and RecT, or Redα and 
Redβ, gene products are expressed from two different regulatable 
promoters, such as the arabinose-inducible BAD promoter or the lac 
promoter, or from non-regulatable promoters. Alternatively, the recE and 
recT, or redα and redβ, genes are expressed on a polycistronic mRNA from 
a single regulatable or non-regulatable promoter. Preferably the expression 
is controlled by regulatable promoters. 

Especially preferred is also an embodiment wherein the recE or redα gene 
is expressed by a regulatable promoter. Thus, the recombinogenic potential 
of the system is only elicited when required and, at other times, possible 
undesired recombination reactions are limited. The recT or redβ gene, on 
the other hand, is preferably overexpressed with respect to recE or redα. 
This may be accomplished by using a strong constitutive promoter, e.g. the 
EM7 promoter, and/or by using a higher copy number of recT, or redβ, 
versus recE, or redα, genes. 

For the purpose of the present invention any recE and recT genes are 
suitable insofar as they allow a homologous recombination of first and 
second DNA molecules with sufficient efficiency to give rise to 
recombination products in more than 1 in 10^9 cells transfected with DNA. 
The recE and recT genes may be derived from any bacterial strain or from 
bacteriophages or may be mutants and variants thereof. Preferred are recE 
and recT genes which are derived from E.coli or from E.coli bacteriophages, 
such as the redα and redβ genes from lambdoid phages, e.g. bacteriophage 
λ. 

More preferably, the recE or redα gene is selected from a nucleic acid 
molecule comprising 

(a) the nucleic acid sequence from position 1320 (ATG) to 2159 (GAC) as 
depicted in Fig. 7B, 

(b) the nucleic acid sequence from position 1320 (ATG) to 1998 (CGA) as 
depicted in Fig. 14B, 

(c) a nucleic acid encoding the same polypeptide within the degeneracy of 
the genetic code and/or 

(d) a nucleic acid sequence which hybridizes under stringent conditions with 
the nucleic acid sequence from (a), (b) and/or (c). 

More preferably, the recT or redβ gene is selected from a nucleic acid 
molecule comprising 

(a) the nucleic acid sequence from position 2155 (ATG) to 2961 (GAA) as 
depicted in Fig. 7B, 

(b) the nucleic acid sequence from position 2086 (ATG) to 2863 (GCA) as 
depicted in Fig. 14B, 

(c) a nucleic acid encoding the same polypeptide within the degeneracy of 
the genetic code and/or 

(d) a nucleic acid sequence which hybridizes under stringent conditions with 
the nucleic acid sequences from (a), (b) and/or (c). 

It should be noted that the present invention also encompasses mutants and 
variants of the given sequences, e.g. naturally occurring mutants and 
variants or mutants and variants obtained by genetic engineering. Further, 
it should be noted that the recE gene depicted in Fig. 7B is an already 
truncated gene encoding amino acids 588-866 of the native protein. 
Mutants and variants preferably have a nucleotide sequence identity of at 
least 60%, preferably of at least 70% and more preferably of at least 80% 
to the recE and recT sequences depicted in Fig. 7B and 13B and to the redα 
and redβ sequences depicted in Fig. 14B. 
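
The identity thresholds above can be illustrated with a short sketch. It assumes that the two nucleotide sequences have already been aligned to equal length (the text does not prescribe an alignment method), and the function names are hypothetical.

```python
def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percentage of alignment columns in which both sequences carry the same base.

    Assumption: the inputs are pre-aligned, equal-length strings; '-' marks a
    gap and never counts as a match.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("sequences must be aligned to the same length")
    if not aligned_a:
        raise ValueError("sequences must not be empty")
    matches = sum(
        1
        for a, b in zip(aligned_a.upper(), aligned_b.upper())
        if a == b and a != "-"
    )
    return 100.0 * matches / len(aligned_a)


def identity_tier(identity: float) -> str:
    """Map a percent identity onto the preference tiers named in the text."""
    if identity >= 80.0:
        return "at least 80% (more preferred)"
    if identity >= 70.0:
        return "at least 70% (preferred)"
    if identity >= 60.0:
        return "at least 60%"
    return "below 60%"
```

A candidate variant would be compared against the reference sequence with `percent_identity` and then classified with `identity_tier`; how gaps are scored is a design choice of this sketch, not something the text specifies.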

According to the present invention hybridization under stringent conditions 
preferably is defined according to Sambrook et al. (1989), infra, and 
comprises a detectable hybridization signal after washing for 30 min in 0.1 
x SSC, 0.5% SDS at 55°C, preferably at 62°C and more preferably at 
68°C. 

In a preferred case the recE and recT genes are derived from the 
corresponding endogenous genes present in the E.coli K12 strain and its 
derivatives or from bacteriophages. In particular, strains that carry the sbcA 
mutation are suitable. Examples of such strains are JC 8679 and JC 9604 
(Gillen et al. (1981), supra). Alternatively, the corresponding genes may 
also be obtained from other coliphages such as lambdoid phages or phage 
P22. 

The genotype of JC 8679 and JC 9604 is Sex (Hfr, F+, F-, or F'): F-. JC 
8679 comprises the mutations: recBC 21, recC 22, sbcA 23, thr-1, ara-14, 
leu B 6, DE (gpt-proA) 62, lacY1, tsx-33, glnV44 (AS), galK2 (Oc), LAM-, 
his-60, relA 1, rps L31 (strR), xyl A5, mtl-1, argE3 (Oc) and thi-1. JC 9604 
comprises the same mutations and further the mutation recA 55. 

Further, it should be noted that the recE and recT, or redα and redβ, genes 
can be isolated from a first donor source, e.g. a donor bacterial cell, and 
transformed into a second receptor source, e.g. a receptor bacterial or 
eukaryotic cell in which they are expressed by recombinant DNA means. 

In one embodiment of the invention, the host cell used is a bacterial strain 
having an sbcA mutation, e.g. one of E.coli strains JC 8679 and JC 9604 
mentioned above. However, the method of the invention is not limited to 
host cells having an sbcA mutation or analogous cells. Surprisingly, it has 
been found that the cloning method of the invention also works in cells 
without sbcA mutation, whether recBC+ or recBC-, e.g. also in prokaryotic 
recBC+ host cells, e.g. in E.coli recBC+ cells. In that case preferably those 
host cells are used in which the product of a recBC type exonuclease 
inhibitor gene is expressed. Preferably, the exonuclease inhibitor is capable 
of inhibiting the host recBC system or an equivalent thereof. A suitable 
example of such exonuclease inhibitor gene is the λ redγ gene (Murphy, 
J. Bacteriol. 173 (1991), 5808-5821) and functional equivalents thereof, 
respectively, which, for example, can be obtained from other coliphages 
such as from phage P22 (Murphy, J. Biol. Chem. 269 (1994), 22507-22515). 

More preferably, the exonuclease inhibitor gene is selected from a nucleic 
acid molecule comprising: 

(a) the nucleic acid sequence from position 3588 (ATG) to 4002 (GTA) as 
depicted in Fig. 14A, 

(b) a nucleic acid encoding the same polypeptide within the degeneracy of 
the genetic code and/or 

(c) a nucleic acid sequence which hybridizes under stringent conditions (as 
defined above) with the nucleic acid sequence from (a) and/or (b). 

Surprisingly, it has been found that the expression of an exonuclease 
inhibitor gene in both recBC+ and recBC- strains leads to significant 
improvement of cloning efficiency. 

The cloning method according to the present invention employs a 
homologous recombination between a first DNA molecule and a second 
DNA molecule. The first DNA molecule can be any DNA molecule that 
carries an origin of replication which is operative in the host cell, e.g. an 
E.coli replication origin. Further, the first DNA molecule is present in a form 
which is capable of being replicated in the host cell. The first DNA 
molecule, i.e. the vector, can be any extrachromosomal DNA molecule 
containing an origin of replication which is operative in said host cell, e.g. 
a plasmid including single, low, medium or high copy plasmids or other 
extrachromosomal circular DNA molecules based on cosmid, P1, BAC or 
PAC vector technology. Examples of such vectors are described, for 
example, by Sambrook et al. (Molecular Cloning, Laboratory Manual, 2nd 
Edition (1989), Cold Spring Harbor Laboratory Press) and Ioannou et al. 
(Nature Genet. 6 (1994), 84-89) or references cited therein. The first DNA 
molecule can also be a host cell chromosome, particularly the E.coli 
chromosome. Preferably, the first DNA molecule is a double-stranded DNA 
molecule. 

The second DNA molecule is preferably a linear DNA molecule and 
comprises at least two regions of sequence homology, preferably of 
sequence identity to regions on the first DNA molecule. These homology or 
identity regions are preferably at least 15 nucleotides each, more preferably 
at least 20 nucleotides and, most preferably, at least 30 nucleotides each. 
Especially good results were obtained when using sequence homology 
regions having a length of about 40 or more nucleotides, e.g. 50 or more 
nucleotides. The two sequence homology regions can be located on the 
linear DNA fragment so that one is at one end and the other is at the other 
end, however they may also be located internally. Preferably, also the 
second DNA molecule is a double-stranded DNA molecule. 

The two sequence homology regions are chosen according to the 
experimental design. There are no limitations on which regions of the first 
DNA molecule can be chosen for the two sequence homology regions 
located on the second DNA molecule, except that the homologous 
recombination event cannot delete the origin of replication of the first DNA 
molecule. The sequence homology regions can be interrupted by 
non-identical sequence regions as long as sufficient sequence homology is 
retained for the homologous recombination reaction. By using sequence 
homology arms having non-identical sequence regions compared to the 
target site, mutations such as substitutions, e.g. point mutations, insertions 
and/or deletions may be introduced into the target site by ET cloning. 
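
The one stated restriction, that the recombination event must not delete the origin of replication, can be expressed as a simple coordinate check. The following is a minimal sketch under stated assumptions: positions are half-open intervals on a linear coordinate system that does not wrap around position 0 of a circular molecule, and all names and coordinates are hypothetical.

```python
from typing import Tuple

# Half-open interval [start, end) on the first DNA molecule (hypothetical model).
Interval = Tuple[int, int]


def replaced_region(left_arm: Interval, right_arm: Interval) -> Interval:
    """Region between the two homology regions that the recombination replaces."""
    if left_arm[1] > right_arm[0]:
        raise ValueError("left homology region must end before the right one starts")
    return (left_arm[1], right_arm[0])


def overlaps(a: Interval, b: Interval) -> bool:
    """True if the two half-open intervals share at least one position."""
    return a[0] < b[1] and b[0] < a[1]


def design_keeps_origin(left_arm: Interval, right_arm: Interval,
                        origin: Interval) -> bool:
    """True if the replaced region does not touch the origin of replication."""
    return not overlaps(replaced_region(left_arm, right_arm), origin)
```

For example, `design_keeps_origin((100, 150), (400, 450), (1000, 1600))` accepts a design whose replaced region lies well away from an origin assumed to span positions 1000 to 1600, whereas moving the right arm to `(1700, 1750)` would be rejected.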

The second foreign DNA molecule which is to be cloned in the bacterial cell 
may be derived from any source. For example, the second DNA molecule 
may be synthesized by a nucleic acid amplification reaction such as a PCR 
where both of the DNA oligonucleotides used to prime the amplification 
contain in addition to sequences at the 3'-ends that serve as a primer for 
the amplification, one or the other of the two homology regions. Using 
oligonucleotides of this design, the DNA product of the amplification can be 
any DNA sequence suitable for amplification and will additionally have a 
sequence homology region at each end. 
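
The oligonucleotide design just described can be sketched in a few lines. This is an illustrative sketch only: the function names, the 20-nucleotide priming length and the example sequences are assumptions, while the minimum arm length of 15 nucleotides is taken from the text above. Each primer carries a homology region at its 5'-end and a template-priming sequence at its 3'-end.

```python
_COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def reverse_complement(seq: str) -> str:
    """Reverse complement of a DNA sequence written 5'->3'."""
    return seq.translate(_COMPLEMENT)[::-1]


def design_et_primers(left_arm: str, right_arm: str, template: str,
                      priming_len: int = 20, min_arm_len: int = 15):
    """Return (forward, reverse) primers, each written 5'->3'.

    left_arm / right_arm: top-strand sequences of the two homology regions.
    template: top-strand sequence of the DNA to be amplified.
    priming_len: hypothetical length of the 3' priming part of each primer.
    """
    if len(left_arm) < min_arm_len or len(right_arm) < min_arm_len:
        raise ValueError("each homology region should be at least 15 nucleotides")
    if len(template) < priming_len:
        raise ValueError("template is shorter than the priming length")
    # Forward primer: left homology arm (5') + start of the template (3').
    forward = left_arm + template[:priming_len]
    # Reverse primer: complement of the right arm (5') + complement of the
    # template end (3'), so that it anneals to the top strand.
    reverse = reverse_complement(right_arm) + reverse_complement(template[-priming_len:])
    return forward, reverse


def expected_product(left_arm: str, right_arm: str, template: str) -> str:
    """Top strand of the amplification product: arm + template + arm."""
    return left_arm + template + right_arm
```

With two 40-nucleotide arms and the default priming length, each primer is 60 nucleotides long and the amplification product carries one homology region at each end.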

A specific example of the generation of the second DNA molecule is the 
amplification of a gene that serves to convey a phenotypic difference to the 
bacterial host cells, in particular, antibiotic resistance. A simple variation of 
this procedure involves the use of oligonucleotides that include other 
sequences in addition to the PCR primer sequence and the sequence 
homology region. A further simple variation is the use of more than two 
amplification primers to generate the amplification product. A further simple 
variation is the use of more than one amplification reaction to generate the 
amplification product. A further variation is the use of DNA fragments 
obtained by methods other than PCR, for example, by endonuclease or 
restriction enzyme cleavage to linearize fragments from any source of DNA. 

It should be noted that the second DNA molecule is not necessarily a single 
species of DNA molecule. It is of course possible to use a heterogeneous 
population of second DNA molecules, e.g. to generate a DNA library, such 
as a genomic or cDNA library. 

The method of the present invention may comprise the contacting of the 
first and second DNA molecules in vivo. In one embodiment of the present 
invention the second DNA fragment is transformed into a bacterial strain 
that already harbors the first vector DNA molecule. In a different 
embodiment, the second DNA molecule and the first DNA molecule are 
mixed together in vitro before co-transformation in the bacterial host cell. 
These two embodiments of the present invention are schematically depicted 
in Fig. 1. The method of transformation can be any method known in the art 
(e.g. Sambrook et al. supra). The preferred method of transformation or 
co-transformation, however, is electroporation. 

After contacting the first and second DNA molecules under conditions 
which favour homologous recombination between first and second DNA 
molecules via the ET cloning mechanism, a host cell is selected in which 
homologous recombination between said first and second DNA molecules 
has occurred. This selection procedure can be carried out by several 
different methods. Three preferred selection methods are 
depicted in Fig. 2 and described in detail below. 

In a first selection method a second DNA fragment is employed which 
carries a gene for a marker placed between the two regions of sequence 
homology, wherein homologous recombination is detectable by expression 
of the marker gene. The marker gene may be a gene for a phenotypic 
marker which is not expressed in the host or from the first DNA molecule. 

Upon recombination by ET cloning, the change in phenotype of the host 
strain conveyed by the stable acquisition of the second DNA fragment 
identifies the ET cloning product. 

In a preferred case, the phenotypic marker is a gene that conveys resistance 
to an antibiotic, in particular, genes that convey resistance to kanamycin, 
ampicillin, chloramphenicol, tetracycline or any other substance that shows 
bactericidal or bacteriostatic effects on the bacterial strain employed. 

A simple variation is the use of a gene that complements a deficiency 
present within the bacterial host strain employed. For example, the host 
strain may be mutated so that it is incapable of growth without a metabolic 
supplement. In the absence of this supplement, a gene on the second DNA 
fragment can complement the mutational defect, thus permitting growth. 
Only those cells which contain the episome carrying the intended DNA 
rearrangement caused by the ET cloning step will grow. 

In another example, the host strain carries a phenotypic marker gene which 
is mutated so that one of its codons is a stop codon that truncates the open 
reading frame. Expression of the full length protein from this phenotypic 
marker gene requires the introduction of a suppressor tRNA gene which, 
once expressed, recognizes the stop codon and permits translation of the 
full open reading frame. The suppressor tRNA gene is introduced by the ET 
cloning step and successful recombinants are identified by selection for, or 
identification of, the expression of the phenotypic marker gene. In these 
cases, only those cells which contain the intended DNA rearrangement 
caused by the ET cloning step will grow. 

A further simple variation is the use of a reporter gene that conveys a 
readily detectable change in colony colour or morphology. In a preferred 
case, the green fluorescence protein (GFP) can be used and colonies 
carrying the ET cloning product identified by the fluorescence emissions of 
GFP. In another preferred case, the lacZ gene can be used and colonies 
carrying the ET cloning product identified by a blue colony colour when 
X-gal is added to the culture medium. 

In a second selection method the insertion of the second DNA fragment into 
the first DNA molecule by ET cloning alters the expression of a marker 
present on the first DNA molecule. In this embodiment the first DNA 
molecule contains at least one marker gene between the two regions of 
sequence homology and homologous recombination may be detected by an 
altered expression, e.g. lack of expression of the marker gene. 

In a preferred application, the marker present on the first DNA molecule is 
a counter-selectable gene product, such as the sacB, ccdB or tetracycline- 
resistance genes. In these cases, bacterial cells that carry the first DNA 
molecule unmodified by the ET cloning step after transformation with the 
second DNA fragment, or co-transformation with the second DNA fragment 
and the first DNA molecule, are plated onto a medium so that the expression of 
the counter-selectable marker conveys a toxic or bacteriostatic effect on the 
host. Only those bacterial cells which contain the first DNA molecule 
carrying the intended DNA rearrangement caused by the ET cloning step 
will grow. 

In another preferred application, the first DNA molecule carries a reporter 
gene that conveys a readily detectable change in colony colour or 
morphology. In a preferred case, the green fluorescence protein (GFP) can 
be present on the first DNA molecule and colonies carrying the first DNA 
molecule with or without the ET cloning product can be distinguished by 
differences in the fluorescence emissions of GFP. In another preferred case, 
the lacZ gene can be present on the first DNA molecule and colonies 
carrying the first DNA molecule with or without the ET cloning product 
can be identified by a blue or white colony colour when X-gal is added to the 
culture medium. 

In a third selection method the integration of the second DNA fragment into 
the first DNA molecule by ET cloning removes a target site for a site 
specific recombinase, termed here an RT (for recombinase target), present 
on the first DNA molecule between the two regions of sequence homology. 
A homologous recombination event may be detected by removal of the 
target site. 

In the absence of the ET cloning product, the RT is available for use by the 
corresponding site specific recombinase. The difference between the 
presence or absence of this RT is the basis for selection of the ET cloning 
product. In the presence of this RT and the corresponding site specific 
recombinase, the site specific recombinase mediates recombination at this 
RT and changes the phenotype of the host so that it is either not able to 
grow or presents a readily observable phenotype. In the absence of this RT, 
the corresponding site specific recombinase is not able to mediate 
recombination. 

In a preferred case, the first DNA molecule to which the second DNA 
fragment is directed, contains two RTs, one of which is adjacent to, but not 
part of, an antibiotic resistance gene. The second DNA fragment is directed, 
by design, to remove this RT. Upon exposure to the corresponding site 
specific recombinase, those first DNA molecules that do not carry the ET 
cloning product will be subject to a site specific recombination reaction 
between the RTs that removes the antibiotic resistance gene and therefore 
the first DNA molecule fails to convey resistance to the corresponding 
antibiotic. Only those first DNA molecules that contain the ET cloning 
product, or have failed to be site specifically recombined for some other 
reason, will convey resistance to the antibiotic. 
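
The selection logic of this preferred case can be summarised as a small truth table. The sketch below is a deliberately simplified model with hypothetical names: it assumes that excision of the resistance gene requires both RTs to be present and the site specific recombinase to have acted on the molecule.

```python
def conveys_resistance(rt_removed_by_et_cloning: bool,
                       recombinase_acted: bool) -> bool:
    """Whether the first DNA molecule still conveys antibiotic resistance.

    rt_removed_by_et_cloning: True if ET cloning removed one of the two RTs.
    recombinase_acted: True if the site specific recombinase recombined the
    molecule; False models failure of recombination "for some other reason".
    """
    remaining_rts = 1 if rt_removed_by_et_cloning else 2
    # Excision of the resistance gene needs two RTs and an active recombinase.
    gene_excised = recombinase_acted and remaining_rts == 2
    return not gene_excised


# Enumerate all four combinations of the simplified model.
TRUTH_TABLE = {
    (et, ssr): conveys_resistance(et, ssr)
    for et in (False, True)
    for ssr in (False, True)
}
```

Only the combination of an unmodified molecule and an active recombinase loses resistance, which is why resistant colonies are enriched for the ET cloning product but may include molecules that escaped site specific recombination.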

In another preferred case, the RT to be removed by ET cloning of the 
second DNA fragment is adjacent to a gene that complements a deficiency 
present within the host strain employed. In another preferred case, the RT 
to be removed by ET cloning of the second DNA fragment is adjacent to a 
reporter gene that conveys a readily detectable change in colony colour or 
morphology. 

In another preferred case, the RT to be removed by ET cloning of the 
second DNA fragment is anywhere on a first episomal DNA molecule and 
the episome carries an origin of replication incompatible with survival of the 
bacterial host cell if it is integrated into the host genome. In this case the 
host genome carries a second RT, which may or may not be a mutated RT, 
so that the corresponding site specific recombinase can integrate the 
episome, via its RT, into the RT sited in the host genome. Other preferred 
RTs include RTs for site specific recombinases of the resolvase/transposase 
class. RTs include those described from existing examples of site specific 
recombination as well as natural or mutated variations thereof. 

The preferred site specific recombinases include Cre, FLP, Kw or any site 
specific recombinase of the integrase class. Other preferred site specific 
recombinases include site specific recombinases of the 
resolvase/transposase class. 

There are no limitations on the method of expression of the site specific 
recombinase in the host cell. In a preferred method, the expression of the 
site specific recombinase is regulated so that expression can be induced and 
quenched according to the optimisation of the ET cloning efficiency. In this 
case, the site specific recombinase gene can be either integrated into the 
host genome or carried on an episome. In another preferred case, the site 
specific recombinase is expressed from an episome that carries a 
conditional origin of replication so that it can be eliminated from the host 
cell. 

In another preferred case, at least two of the above three selection methods 
are combined. A particularly preferred case involves a two-step use of the 
first selection method above, followed by use of the second selection 
method. This combined use requires, most simply, that the DNA fragment 
to be cloned includes a gene, or genes, that permits the identification, in the 
first step, of correct ET cloning products by the acquisition of a phenotypic 
change. In a second step, expression of the gene or genes introduced in the 
first step is altered so that a second round of ET cloning products can be 
identified. In a preferred example, the gene employed is the tetracycline 
resistance gene and the first step ET cloning products are identified by the 
acquisition of tetracycline resistance. In the second step, loss of expression 
of the tetracycline gene is identified by loss of sensitivity to nickel chloride 
or any other agent that is toxic to the host cell when the 
tetracycline gene is expressed. This two-step procedure permits the 
identification of ET cloning products by first the integration of a gene that 
conveys a phenotypic change on the host, and second by the loss of a 
related phenotypic change, most simply by removal of some of the DNA 
sequences integrated in the first step. Thereby the genes used to identify 
ET cloning products can be inserted and then removed to leave ET cloning 
products that are free of these genes. 

In a further embodiment of the present invention the ET cloning may also 
be used for a recombination method comprising the steps of 

a) providing a source of RecE and RecT, or Redα and Redβ, proteins, 

b) contacting a first DNA molecule which is capable of being replicated in 
a suitable host cell with a second DNA molecule comprising at least two 
regions of sequence homology to regions on the first DNA molecule, under 
conditions which favour homologous recombination between said first and 
second DNA molecules and 

c) selecting DNA molecules in which a homologous recombination between 
said first and second DNA molecules has occurred. 

The source of RecE and RecT, or Redα and Redβ, proteins may be either 
purified or partially purified RecE and RecT, or Redα and Redβ, proteins or 
cell extracts comprising RecE and RecT, or Redα and Redβ, proteins. 

The homologous recombination event in this embodiment may occur in 
vitro, e.g. when providing a cell extract containing further components 
required for homologous recombination. The homologous recombination 
event, however, may also occur in vivo, e.g. by introducing RecE and RecT, 
or Redα and Redβ, proteins or the extract in a host cell (which may be 
recET positive or not, or redαβ positive or not) and contacting the DNA 
molecules in the host cell. When the recombination occurs in vitro the 
selection of DNA molecules may be accomplished by transforming the 
recombination mixture in a suitable host cell and selecting for positive 
clones as described above. When the recombination occurs in vivo the 
selection methods as described above may directly be applied. 

A further subject matter of the invention is the use of cells, preferably 
bacterial cells, most preferably E.coli cells, capable of expressing the recE 
and recT, or redα and redβ, genes as a host cell for a cloning method 
involving homologous recombination. 

Still a further subject matter of the invention is a vector system capable of 
expressing recE and recT, or redα and redβ, genes in a host cell and its use 
for a cloning method involving homologous recombination. Preferably, the 
vector system is also capable of expressing an exonuclease inhibitor gene 
as defined above, e.g. the λ redγ gene. The vector system may comprise at 
least one vector. The recE and recT, or redα and redβ, genes are preferably 
located on a single vector and more preferably under control of a 
regulatable promoter which may be the same for both genes or a single 
promoter for each gene. Especially preferred is a vector system which is 
capable of overexpressing the recT, or redβ, gene versus the recE, or redα, 
gene. 

Still a further subject matter of the invention is the use of a source of RecE 
and RecT, or Redα and Redβ, proteins for a cloning method involving 
homologous recombination. 

A still further subject matter of the invention is a reagent kit for cloning 
comprising 

(a) a host cell, preferably a bacterial host cell, 

(b) means of expressing recE and recT, or redα and redβ, genes in 
said host cell, e.g. comprising a vector system, and 

(c) a recipient cloning vehicle, e.g. a vector, capable of being 
replicated in said cell. 

On the one hand, the recipient cloning vehicle which corresponds to the 
first DNA molecule of the process of the invention can already be present 
in the bacterial cell. On the other hand, it can be present separated from the 
bacterial cell. 

In a further embodiment the reagent kit comprises 

(a) a source for RecE and RecT, or Redα and Redβ, proteins and 

(b) a recipient cloning vehicle capable of being propagated in a host cell and 

(c) optionally a host cell suitable for propagating said recipient cloning 
vehicle. 

The reagent kit furthermore contains, preferably, means for expressing a 
site specific recombinase in said host cell, in particular, when the recipient 
ET cloning product contains at least one site specific recombinase target 
site. Moreover, the reagent kit can also contain DNA molecules suitable for 
use as a source of linear DNA fragments used for ET cloning, preferably by 
serving as templates for PCR generation of the linear fragment, also as 
specifically designed DNA vectors from which the linear DNA fragment is 
released by restriction enzyme cleavage, or as prepared linear fragments 
included in the kit for use as positive controls or other tasks. Moreover, the 
reagent kit can also contain nucleic acid amplification primers comprising 
a region of homology to said vector. Preferably, this region of homology is 
located at the 5'-end of the nucleic acid amplification primer. 

The invention is further illustrated by the following Sequence listings, 
Figures and Examples. 

SEQ ID NO 1: shows the nucleic acid sequence of the plasmid 
pBAD24-recET (Fig. 7). 

SEQ ID NOs 2/3: show the nucleic acid and amino acid sequences of the 
truncated recE gene (t-recE) present on pBAD24-recET at positions 
1320-2162. 

SEQ ID NOs 4/5: show the nucleic acid and amino acid sequences of the 
recT gene present on pBAD24-recET at positions 2155-2972. 

SEQ ID NOs 6/7: show the nucleic acid and amino acid sequences of the 
araC gene present on the complementary strand to the one shown of 
pBAD24-recET at positions 974-996. 

SEQ ID NOs 8/9: show the nucleic acid and amino acid sequences of the 
bla gene present on pBAD24-recET at positions 3493-4353. 

SEQ ID NO 10: shows the nucleic acid sequence of the plasmid 
pBAD-ETγ (Fig. 13). 

SEQ ID NO 11: shows the nucleic acid sequence of the plasmid 
pBAD-αβγ (Fig. 14) as well as the coding regions for the genes redα 
(1320-200), redβ (2086-2371) and redγ (3403-3819). 

SEQ ID NOs 12-14: show the amino acid sequences of the Redα, Redβ and 
Redγ proteins, respectively. The redγ sequence is present on each of 
pBAD-ETγ (Fig. 13) and pBAD-αβγ (Fig. 14). 

Figure 1 

A preferred method for ET cloning is shown by diagram. The linear DNA 
fragment to be cloned is synthesized by PCR using oligonucleotide primers 
that contain a left homology arm chosen to match sequences in the 
recipient episome and a sequence for priming in the PCR reaction, and a 
right homology arm chosen to match another sequence in the recipient 
episome and a sequence for priming in the PCR reaction. The product of the 
PCR reaction, here a selectable marker gene (sm1), is consequently flanked 
by the left and right homology arms and can be mixed together in vitro with 
the episome before co-transformation, or transformed into a host cell 
harboring the target episome. The host cell contains the products of the 
recE and recT genes. ET cloning products are identified by the combination 
of two selectable markers, sm1 and sm2 on the recipient episome. 

Figure 2 

Three ways to identify ET cloning products are depicted. The first, (on the 
left of the figure), shows the acquisition, by ET cloning, of a gene that 
conveys a phenotypic difference to the host, here a selectable marker gene 
(sm). The second (in the centre of the figure) shows the loss, by ET cloning, 
of a gene that conveys a phenotypic difference to the host, here a counter 
selectable marker gene (counter-sm). The third shows the loss of a target 
site (RT, shown as triangles on the circular episome) for a site specific 
recombinase (SSR), by ET cloning. In this case, the correct ET cloning 
product deletes one of the target sites required by the SSR to delete a 
selectable marker gene (sm). The failure of the SSR to delete the sm gene 
identifies the correct ET cloning product. 

Figure 3 

A simple example of ET cloning is presented. 

(a) Top panel: PCR products (left lane), synthesized from oligonucleotides
designed as described in Fig. 1 to amplify by PCR a kanamycin resistance
gene and to be flanked by homology arms present in the recipient vector,
were mixed in vitro with the recipient vector (2nd lane) and cotransformed
into a recET+ E.coli host. The recipient vector carried an ampicillin
resistance gene. (b) Transformation of the sbcA E.coli strain JC9604 with
either the PCR product alone (0.2 µg) or the vector alone (0.3 µg) did not
convey resistance to double selection with ampicillin and kanamycin
(amp + kan); however, cotransformation of both the PCR product and the
vector produced double resistant colonies. More than 95% of these colonies
contained the correct ET cloning product where the kanamycin gene had
precisely integrated into the recipient vector according to the choice of
homology arms. The two lanes on the right of (a) show Pvu II restriction
enzyme digestion of the recipient vector before and after ET cloning. (c) As
for b, except that six PCR products (0.2 µg each) were cotransformed with
pSVpaZ11 (0.3 µg each) into JC9604 and plated onto Amp + Kan plates or
Amp plates. Results are plotted as Amp + Kan-resistant colonies,
representing recombination products, divided by Amp-resistant colonies, 
representing the plasmid transformation efficiency of the competent cell 
preparation, x 10*. The PCR products were equivalent to the a-b PCR 
product except that homology arm lengths were varied. Results are from 
five experiments that used the same batches of competent cells and DNAs. 
Error bars represent standard deviation. (d) Eight products flanked by 50 bp
homology arms were cotransformed with pSVpaZ11 into JC9604. All eight
PCR products contained the same left homology arm and amplified neo
gene. The right homology arms were chosen from the pSVpaZ11 sequence
to be adjacent to (0), or at increasing distances (7-3100 bp) from, the left.
Results are from four experiments.



Figure 4 

ET cloning in an approximately 100 kb P1 vector to exchange the selectable
marker.

A P1 clone which uses a kanamycin resistance gene as selectable marker
and which contains at least 7.0 kb of the mouse Hoxa gene cluster was
used. Before ET cloning, this episome conveys kanamycin resistance (top
panel, upper left) to its host E.coli, which are ampicillin sensitive (top panel,
upper right). A linear DNA fragment designed to replace the kanamycin
resistance gene with an ampicillin resistance gene was made by PCR as
outlined in Fig. 1 and transformed into E.coli host cells in which the recipient
Hoxa/P1 vector was resident. ET cloning resulted in the deletion of the
kanamycin resistance gene, and restoration of kanamycin sensitivity (top
panel, lower left) and the acquisition of ampicillin resistance (top panel,
lower right). Precise DNA recombination was verified by restriction digestion
and Southern blotting analyses of isolated DNA before and after ET cloning
(lower panel).

Figure 5 

ET cloning to remove a counter selectable marker 

A PCR fragment (upper panel, left, third lane) made as outlined in Figs. 1
and 2 to contain the kanamycin resistance gene was directed by its chosen
homology arms to delete the counter selectable ccdB gene present in the
vector, pZero-2.1. The PCR product and the pZero vector were mixed in
vitro (upper panel, left, 1st lane) before cotransformation into a recE/recT+
E.coli host. Transformation of pZero-2.1 alone and plating onto kanamycin
selection medium resulted in little colony growth (lower panel, left).
Cotransformation of pZero-2.1 and the PCR product presented ET cloning
products (lower panel, right) which showed the intended molecular event
as visualized by Pvu II digestion (upper panel, right).

Figure 6 

ET cloning mediated by inducible expression of recE and recT from an 
episome. 

RecE/RecT mediate homologous recombination between linear and circular
DNA molecules. (a) The plasmid pBAD24-recET was transformed into E.coli
JC5547, and then batches of competent cells were prepared after induction
of RecE/RecT expression by addition of L-arabinose for the times indicated
before harvesting. A PCR product, made using oligonucleotides e and f to
contain the chloramphenicol resistance gene (cm) of pMAK705 and 50 bp
homology arms chosen to flank the ampicillin resistance gene (bla) of
pBAD24-recET, was then transformed and recombinants identified on
chloramphenicol plates. (b) Arabinose was added to cultures of
pBAD24-recET transformed JC5547 for different times immediately before
harvesting for competent cell preparation. Total protein expression was
analyzed by SDS-PAGE and Coomassie blue staining. (c) The number of
chloramphenicol resistant colonies per µg of PCR product was normalized
against a control for transformation efficiency, determined by including
5 pg pZero2.1, conveying kanamycin resistance, in the transformation and
plating an aliquot onto Kan plates.

Figure 7A 

The plasmid pBAD24-recET is shown by diagram. The plasmid contains the
genes recE (in a truncated form) and recT under control of the inducible
BAD promoter (PBAD). The plasmid further contains an ampicillin resistance
gene (Amp^r) and an araC gene.

Figure 7B 

The nucleic acid sequence and the protein coding portions of pBAD24-recET 
are depicted. 

Figure 8 

Manipulation of a large E.coli episome by multiple recombination steps. a
Scheme of the recombination reactions. A P1 clone of the mouse Hoxa
complex, resident in JC9604, was modified by recombination with PCR
products that contained the neo gene and two Flp recombination targets
(FRTs). The two PCR products were identical except that one was flanked
by g and h homology arms (insertion), and the other was flanked by i and
h homology arms (deletion). In a second step, the neo gene was removed
by Flp recombination between the FRTs by transient transformation of a Flp
expression plasmid based on the pSC101 temperature-sensitive origin (ts
ori). b Upper panel: ethidium bromide stained agarose gel showing EcoRI
digestions of P1 DNA preparations from three independent colonies for each
step. Middle panel: a Southern blot of the upper panel hybridized with a neo
gene probe. Lower panel: a Southern blot of the upper panel hybridized with
a Hoxa3 probe to visualize the site of recombination. Lanes 1, the original
Hoxa3 P1 clone grown in E.coli strain NS3145. Lanes 2, replacement of the
Tn903 kanamycin resistance gene resident in the P1 vector with an
ampicillin resistance gene increased the 8.1 kb band (lanes 1) to 9.0 kb.
Lanes 3, insertion of the Tn5-neo gene with g-h homology arms upstream
of Hoxa3 increased the 6.7 kb band (lanes 1, 2) to 9.0 kb. Lanes 4, Flp
recombinase deleted the g-h neo gene, reducing the 9.0 kb band (lanes 3)
back to 6.7 kb. Lanes 5, deletion of 6 kb of Hoxa3-4 intergenic DNA by
replacement with the i-h neo gene decreased the 6.7 kb band (lanes 2) to
4.5 kb. Lanes 6, Flp recombinase deleted the i-h neo gene, reducing the 4.5
kb band to 2.3 kb.

Figure 9 

Manipulation of the E.coli chromosome. a Scheme of the recombination
reactions. The endogenous lacZ gene of JC9604 at 7.3' of the E.coli
chromosome, shown in expanded form with relevant Ava I sites and
coordinates, was targeted by a PCR fragment that contained the neo gene
flanked by homology arms j and k, and loxP sites, as depicted. Integration
of the neo gene removed most of the lacZ gene including an Ava I site to
alter the 1443 and 3027 bp bands into a 3277 bp band. In a second step,
the neo gene was removed by Cre recombination between the loxPs by
transient transformation of a Cre expression plasmid based on the pSC101
temperature-sensitive origin (ts ori). Removal of the neo gene by Cre
recombinase reduces the 3277 bp band to 2111 bp. b β-galactosidase
expression evaluated by streaking colonies on X-Gal plates. The top row of
three streaks shows β-galactosidase expression in the host JC9604 strain
(w.t.); the lower three rows (Km) show 24 independent primary colonies,
20 of which display a loss of β-galactosidase expression indicative of the
intended recombination event. c Southern analysis of E.coli chromosomal
DNA digested with Ava I using a random primed probe made from the entire
lacZ coding region; lanes 1, 2, w.t.; lanes 3-6, four independent white
colonies after integration of the j-k neo gene; lanes 7-10, the same four
colonies after transient transformation with the Cre expression plasmid.

Figure 10 

Two rounds of ET cloning to introduce a point mutation. a Scheme of the
recombination reactions. The lacZ gene of pSVpaX1 was disrupted in
JC9604lacZ, a strain made by the experiment of Fig. 9 to ablate endogenous
lacZ expression and remove competitive sequences, by a sacB-neo gene
cassette, synthesized by PCR from pIB279 and flanked by l and m homology
arms. The recombinants, termed pSV-sacB-neo, were selected on
Amp + Kan plates. The lacZ gene of pSV-sacB-neo was then repaired by a
PCR fragment made from the intact lacZ gene using l* and m* homology
arms. The m* homology arm included a silent C to G change that created
a BamHI site. The recombinants, termed pSVpaX1*, were identified by
counter selection against the sacB gene using 7% sucrose. b
β-galactosidase expression from pSVpaX1 was disrupted in pSV-sacB-neo and
restored in pSVpaX1*. Expression was analyzed on X-gal plates. Three
independent colonies of each of pSV-sacB-neo and pSVpaX1* are shown. c
Ethidium bromide stained agarose gels of BamHI digested DNA prepared
from independent colonies taken after counter selection with sucrose. All
β-galactosidase expressing colonies (blue) contained the introduced BamHI
restriction site (upper panel). All white colonies displayed large
rearrangements and no product carried the diagnostic 1.5 kb BamHI
restriction fragment (lower panel).

Figure 11 

Transference of ET cloning into a recBC+ host to modify a large episome.
a Scheme of the plasmid, pBAD-ETγ, which carries the mobile ET system,
and the strategy employed to target the Hoxa P1 episome. pBAD-ETγ is
based on pBAD24 and includes (i) the truncated recE gene (t-recE) under
the arabinose-inducible PBAD promoter; (ii) the recT gene under the EM7
promoter; and (iii) the redγ gene under the Tn5 promoter. It was
transformed into NS3145, a recA E.coli strain which contained the Hoxa P1
episome. After arabinose induction, competent cells were prepared and
transformed with a PCR product carrying the chloramphenicol resistance
gene (cm) flanked by n and p homology arms. n and p were chosen to
recombine with a segment of the P1 vector. b Southern blots of Pvu II
digested DNAs hybridized with a probe made from the P1 vector to visualize
the recombination target site (upper panel) and a probe made from the
chloramphenicol resistance gene (lower panel). Lane 1, DNA prepared from
cells harboring the Hoxa P1 episome before ET cloning. Lanes 2-17, DNA
prepared from 16 independent chloramphenicol resistant colonies.

Figure 12 

Comparison of ET cloning using the recE/recT genes in pBAD-ETγ with
redα/redβ genes in pBAD-αβγ.

The plasmids pBAD-ETγ or pBAD-αβγ, depicted, were transformed into the
E.coli recA-, recBC+ strain, DK1, and targeted by a chloramphenicol gene
as described in Fig. 6 to evaluate ET cloning efficiencies. Arabinose
induction of protein expression was for 1 hour.

Figure 13A

The plasmid pBAD-ETγ is shown by diagram.

Figure 13B

The nucleic acid sequence and the protein coding portions of pBAD-ETγ are
depicted.

Figure 14A

The plasmid pBAD-αβγ is shown by diagram. This plasmid substantially
corresponds to the plasmid shown in Fig. 13 except that the recE and recT
genes are substituted by the redα and redβ genes.

Figure 14B

The nucleic acid sequence and the protein coding portions of pBAD-αβγ are
depicted.

1. Methods

1.1 Preparation of linear fragments

Standard PCR reaction conditions were used to amplify linear DNA
fragments. The sequences of the primers used are depicted in Table 1.

Table 1

The Tn5-neo gene from pJP5603 (Penfold and Pemberton, Gene 113
(1992), 145-146) was amplified by using oligo pairs a/b and c/d. The
chloramphenicol (cm) resistance gene from pMAK705 (Hashimoto-Gotoh and
Sekiguchi, J. Bacteriol. 131 (1977), 405-412) was amplified by using primer
pairs e/f and n/p. The Tn5-neo gene flanked by FRT or loxP sites was
amplified from pKaZ or pKaX
(http://www.embl-heidelberg.de/ExternalInfo/Stewart) using oligo pairs
i/h, g/h and j/k. The sacB-neo cassette from pIB279 (Blomfield et al.,
Mol. Microbiol. 5 (1991), 1447-1457) was amplified by using oligo pair
l/m. The lacZ gene fragment from pSVpaZ11 (Buchholz et al., Nucleic
Acids Res. 24 (1996), 4256-4262) was amplified using oligo pair l*/m*.
PCR products were purified using the QIAGEN PCR Purification Kit and
eluted with H2O, followed by digestion of any residual template DNA with
Dpn I. After digestion, PCR products were extracted once with
phenol:CHCl3, ethanol precipitated and resuspended in H2O at approximately
0.5 µg/µl.

1.2 Preparation of competent cells and electroporation 

Saturated overnight cultures were diluted 50 fold into LB medium and grown
to an OD600 of 0.5, followed by chilling on ice for 15 min. Bacterial cells
were centrifuged at 7,000 rpm for 10 min at 0°C. The pellet was
resuspended in ice-cold 10% glycerol and centrifuged again (7,000 rpm,
-5°C, 10 min). This was repeated twice more and the cell pellet was
suspended in an equal volume of ice-cold 10% glycerol. Aliquots of 50 µl
were frozen in liquid nitrogen and stored at -80°C. Cells were thawed on
ice and 1 µl DNA solution (containing, for co-transformation, 0.3 µg plasmid
and 0.2 µg PCR products; or, for transformation, 0.2 µg PCR products) was
added. Electroporation was performed using ice-cold cuvettes and a Bio-Rad
Gene Pulser set to 25 µFD, 2.3 kV with Pulse Controller set at 200 ohms.
LB medium (1 ml) was added after electroporation. The cells were incubated
at 37°C for 1 hour with shaking and then spread on antibiotic plates.



1.3 Induction of RecE and RecT expression

E.coli JC5547 carrying pBAD24-recET was cultured overnight in LB medium
plus 0.2% glucose, 100 µg/ml ampicillin. Five parallel LB cultures, one of
which (0) included 0.2% glucose, were started by a 1/100 inoculation. The
cultures were incubated at 37°C with shaking for 4 hours and 0.1%
L-arabinose was added 3, 2, 1 or 1/2 hour before harvesting and processing
as above. Immediately before harvesting, 100 µl was removed for analysis
on a 10% SDS-polyacrylamide gel. E.coli NS3145 carrying Hoxa-P1 and
pBAD-ETγ was induced by 0.1% L-arabinose for 90 min before harvesting.

1.4 Transient transformation of FLP and Cre expression plasmids

The FLP and Cre expression plasmids, 705-Cre and 705-FLP (Buchholz et
al., Nucleic Acids Res. 24 (1996), 3113-3119), based on the pSC101
temperature sensitive origin, were transformed into rubidium chloride
competent bacterial cells. Cells were spread on 25 µg/ml chloramphenicol
plates, and grown for 2 days at 30°C, whereupon colonies were picked,
replated on L-agar plates without any antibiotics and incubated at 40°C
overnight. Single colonies were analyzed on various antibiotic plates and all
showed the expected loss of chloramphenicol and kanamycin resistance.

1.5 Sucrose counter selection of sacB expression 

The E.coli JC9604lacZ strain, generated as described in Fig. 9, was
cotransformed with a sacB-neo PCR fragment and pSVpaX1 (Buchholz et
al., Nucleic Acids Res. 24 (1996), 4256-4262). After selection on 100 µg/ml
ampicillin, 50 µg/ml kanamycin plates, pSVpaX-sacB-neo plasmids were
isolated and cotransformed into fresh JC9604lacZ cells with a PCR
fragment amplified from pSVpaX1 using primers l*/m*. Oligo m* carried a
silent point mutation which generated a BamHI site. Cells were plated on
7% sucrose, 100 µg/ml ampicillin, 40 µg/ml X-gal plates and incubated at
28°C for 2 days. The blue and white colonies grown on sucrose plates
were counted and further checked by restriction analysis.

1.6 Other methods

DNA preparation and Southern analysis were performed according to
standard procedures. Hybridization probes were generated by random
priming of fragments isolated from the Tn5 neo gene (PvuII), Hoxa3 gene
(both HindIII fragments), lacZ genes (EcoRI and BamHI fragments from
pSVpaX1), cm gene (BstS1 fragments from pMAK705) and P1 vector
fragments (2.2 kb EcoRI fragments from P1 vector).

2. Results 

2.1 Identification of recombination events in E.coli 

To identify a flexible homologous recombination reaction in E.coli, an assay
based on recombination between linear and circular DNAs was designed
(Fig. 1, Fig. 3). Linear DNA carrying the Tn5 kanamycin resistance gene (neo)
was made by PCR (Fig. 3a). Initially, the oligonucleotides used for PCR
amplification of neo were 60mers consisting of 42 nucleotides at their 5'
ends identical to chosen regions in the plasmid and, at the 3' ends, 18
nucleotides to serve as PCR primers. Linear and circular DNAs were mixed
in equimolar proportions and co-transformed into a variety of E.coli hosts.
Homologous recombination was only detected in sbcA E.coli hosts. More
than 95% of double ampicillin/kanamycin resistant colonies (Fig. 3b)
contained the expected homologously recombined plasmid as determined
by restriction digestion and sequencing. Only a low background of
kanamycin resistance, due to genomic integration of the neo gene, was
apparent (not shown).
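
The 60mer layout described above (42 nucleotides of homology at the 5' end,
18 nucleotides of priming sequence at the 3' end) can be sketched in Python
as follows. This is only an illustrative sketch: the function names and any
sequences passed to them are hypothetical and are not the oligonucleotides
of Table 1.

```python
def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence (A, C, G, T only)."""
    complement = {"A": "T", "C": "G", "G": "C", "T": "A"}
    return "".join(complement[base] for base in reversed(seq.upper()))


def design_et_oligos(left_arm, right_arm, cassette, arm_len=42, primer_len=18):
    """Build the two PCR oligonucleotides for a linear ET cloning fragment.

    left_arm, right_arm: top-strand target sequences chosen as homology arms.
    cassette: top-strand sequence to be amplified (e.g. a resistance gene).
    All names and defaults here are illustrative assumptions.
    """
    if len(left_arm) != arm_len or len(right_arm) != arm_len:
        raise ValueError("each homology arm must be arm_len nucleotides long")
    if len(cassette) < 2 * primer_len:
        raise ValueError("cassette is too short for two primers")
    # Forward oligo: homology arm at the 5' end, priming region at the 3' end.
    forward = left_arm.upper() + cassette[:primer_len].upper()
    # Reverse oligo is written 5'->3' on the bottom strand, so its 5' end is
    # the reverse complement of the right arm and its 3' end primes the cassette.
    reverse = reverse_complement(cassette[-primer_len:] + right_arm)
    return forward, reverse
```

With the default lengths each oligonucleotide is 60 nucleotides long, and the
PCR product reads left arm, cassette, right arm on the top strand.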

The linear plus circular recombination reaction was characterized in two
ways. The relationship between homology arm length and recombination
efficiency was simple, with longer arms recombining more efficiently
(Fig. 3c). Efficiency increased within the range tested, up to 60 bp. The
effect of distance between the two chosen homology sites in the recipient
plasmid was examined (Fig. 3d). A set of eight PCR fragments was
generated by use of a constant left homology arm with differing right
homology arms. The right homology arms were chosen from the plasmid
sequence to be 0 - 3100 bp from the left. Correct products were readily
obtained from all, with less than 4 fold difference between them, although
the insertional product (0) was least efficient. Correct products also
depended on the presence of both homology arms, since PCR fragments
containing only one arm failed to work.
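
The dependence of the outcome on the distance between the two homology
sites can be illustrated with a small in silico sketch. It is a simplified
model under stated assumptions: the circular recipient is written as a
linear string, each arm occurs exactly once, and the left arm lies upstream
of the right arm. The function name is hypothetical.

```python
def et_recombine(target, left_arm, right_arm, cassette):
    """Predict the product of a linear-plus-circular recombination.

    Returns (product, deleted), where deleted is the number of target
    nucleotides between the two arms that the cassette replaces. A value of
    0 corresponds to a pure insertion between adjacent arms.
    """
    if target.count(left_arm) != 1 or target.count(right_arm) != 1:
        raise ValueError("each homology arm must occur exactly once")
    left_end = target.index(left_arm) + len(left_arm)
    right_start = target.index(right_arm)
    if right_start < left_end:
        raise ValueError("left arm must lie upstream of the right arm")
    deleted = right_start - left_end
    # Both arms are retained; only the sequence between them is exchanged.
    product = target[:left_end] + cassette + target[right_start:]
    return product, deleted
```

In this model, arms chosen 0 bp apart give an insertion, while arms chosen
further apart replace the intervening sequence with the cassette.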

2.2 Involvement of RecE and RecT 

The relationship between host genotype and this homologous recombination 
reaction was more systematically examined using a panel of E.coli strains 
deficient in various recombination components (Table 2). 

Table 2 

Only the two sbcA strains, JC8679 and JC9604, presented the intended
recombination products and RecA was not required. In sbcA strains,
expression of RecE and RecT is activated. Dependence on recE can be
inferred from comparison of JC8679 with JC8691. Notably, no
recombination products were observed in JC9387, suggesting that the
sbcBC background is not capable of supporting homologous recombination
based on 50 nucleotide homology arms.

To demonstrate that RecE and RecT are involved, part of the recET operon
was cloned into an inducible expression vector to create pBAD24-recET
(Fig. 6a). The recE gene was truncated at its N-terminal end, as the first 588
amino acids of RecE are dispensable. The recBC strain, JC5547, was
transformed with pBAD24-recET and a time course of RecE/RecT induction
performed by adding arabinose to the culture media at various times before
harvesting for competent cells. The batches of harvested competent cells
were evaluated for protein expression by gel electrophoresis (Fig. 6b) and
for recombination between a linear DNA fragment and the endogenous
pBAD24-recET plasmid (Fig. 6c). Without induction of RecE/RecT, no
recombinant products were found, whereas recombination increased in
approximate concordance with increased RecE/RecT expression. This
experiment also shows that co-transformation of linear and circular DNAs
is not essential and the circular recipient can be endogenous in the host.
From the results shown in Figs. 3, 6 and Table 2, we conclude that RecE
and RecT mediate a very useful homologous recombination reaction in
recBC E.coli at workable frequencies. Since RecE and RecT are involved, we
refer to this way of recombining linear and circular DNA fragments as "ET
cloning".

2.3 Application of ET cloning to large target DNAs

To show that large DNA episomes could be manipulated in E.coli, a > 76
kb P1 clone that contains at least 59 kb of the intact mouse Hoxa complex
(confirmed by DNA sequencing and Southern blotting) was transferred to
an E.coli strain having an sbcA background (JC9604) and subjected to two
rounds of ET cloning. In the first round, the Tn903 kanamycin resistance
gene resident in the P1 vector was replaced by an ampicillin resistance gene
(Fig. 4). In the second round, the interval between the Hoxa3 and a4 genes
was targeted either by inserting the neo gene between two base pairs
upstream of the Hoxa3 proximal promoter, or by deleting 6203 bp between
the Hoxa3 and a4 genes (Fig. 8a). Both insertional and deletional ET cloning
products were readily obtained (Fig. 8b, lanes 2, 3 and 5), showing that the
two rounds of ET cloning took place in this large E.coli episome with
precision and no apparent unintended recombination.

The general applicability of ET cloning was further examined by targeting
a gene in the E.coli chromosome (Fig. 9a). The β-galactosidase (lacZ) gene
of JC9604 was chosen so that the ratio between correct and incorrect
recombinants could be determined by evaluating β-galactosidase
expression. Standard conditions (0.2 µg PCR fragment; 50 µl competent
cells) produced 24 primary colonies, 20 of which were correct as
determined by β-galactosidase expression (Fig. 9b) and DNA analysis
(Fig. 9c, lanes 3-6).

2.4 Secondary recombination reactions to remove operational sequences 

The products of ET cloning as described above are limited by the necessary
inclusion of selectable marker genes. Two different ways to use a further
recombination step to remove this limitation were developed. In the first
way, site specific recombination mediated by either Flp or Cre recombinase
was employed. In the experiments of Figs. 8 and 9, either Flp recombination
target sites (FRTs) or Cre recombination target sites (loxPs) were included
to flank the neo gene in the linear substrates. Recombination between the
FRTs or loxPs was accomplished by Flp or Cre, respectively, expressed from
plasmids with the pSC101 temperature sensitive replication origin
(Hashimoto-Gotoh and Sekiguchi, J. Bacteriol. 131 (1977), 405-412) to
permit simple elimination of these plasmids after site specific recombination
by temperature shift. The precisely recombined Hoxa P1 vector was
recovered after both ET and Flp recombination with no other recombination
products apparent (Fig. 8, lanes 4 and 6). Similarly, Cre recombinase
precisely recombined the targeted lacZ allele (Fig. 9, lanes 7-10). Thus site
specific recombination can be readily coupled with ET cloning to remove
operational sequences and leave a 34 bp site specific recombination target
site at the point of DNA manipulation.
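
The way excision between two directly repeated target sites leaves a single
34 bp site behind can be sketched as follows. This is a minimal illustration,
not the experimental procedure: the loxP sequence used as the default is the
commonly cited 34 bp site and is an assumption here rather than something
taken from this document, and the function name is hypothetical.

```python
# Commonly cited loxP site: 13 bp repeat + 8 bp spacer + 13 bp repeat (assumed).
LOXP = "ATAACTTCGTATA" + "GCATACAT" + "TATACGAAGTTAT"


def excise_between_sites(seq, site=LOXP):
    """Model excision between two directly repeated recombinase target sites.

    The sequence from the start of the first site up to the start of the
    second site is removed, so exactly one copy of the site remains.
    """
    first = seq.find(site)
    if first == -1:
        raise ValueError("no target site found")
    second = seq.find(site, first + len(site))
    if second == -1:
        raise ValueError("a second, directly repeated site is required")
    if seq.find(site, second + len(site)) != -1:
        raise ValueError("more than two sites: outcome is ambiguous")
    return seq[:first] + seq[second:]
```

Applied to a cassette flanked by two such sites, the result keeps the
flanking sequences and one site, matching the single residual target site
described above.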



In the second way to remove the selectable marker gene, two rounds of ET 
cloning, combining positive and counter selection steps, were used to leave 
the DNA product free of any operational sequences (Fig. 10a). 

Additionally, this experiment was designed to evaluate, by a functional test
based on β-galactosidase activity, whether ET cloning promoted small
mutations such as frame shift or point mutations within the region being
manipulated. In the first round, the lacZ gene of pSVpaX1 was disrupted
with a 3.3 kb PCR fragment carrying the neo and B.subtilis sacB (Blomfield
et al., Mol. Microbiol. 5 (1991), 1447-1457) genes, by selection for
kanamycin resistance (Fig. 10a). As shown above for other positively
selected recombination products, virtually all selected colonies were white
(Fig. 10b), indicative of successful lacZ disruption, and 17 of 17 were
confirmed as correct recombinants by DNA analysis. In the second round,
a 1.5 kb PCR fragment designed to repair lacZ was introduced by counter
selection against the sacB gene. Repair of lacZ included a silent point
mutation to create a BamHI restriction site. Approximately one quarter of
sucrose resistant colonies expressed β-galactosidase, and all analyzed (17
of 17; Fig. 10c) carried the repaired lacZ gene with the BamHI point
mutation. The remaining three quarters of sucrose resistant colonies did not
express β-galactosidase, and all analyzed (17 of 17; Fig. 10c) had
undergone a variety of large mutational events, none of which resembled
the ET cloning product. Thus, in two rounds of ET cloning directed at the
lacZ gene, no disturbances of β-galactosidase activity by small mutations
were observed, indicating that RecE/RecT recombination works with high
fidelity. The significant presence of incorrect products observed in the
counter selection step is an inherent limitation of the use of counter
selection, since any mutation that ablates expression of the counter
selection gene will be selected. Notably, all incorrect products were large
mutations and therefore easily distinguished from the correct ET product by
DNA analysis. In a different experiment (Fig. 5), we observed that ET cloning
into pZero-2.1 (Invitrogen) by counter selection against the ccdB gene gave
a lower background of incorrect products (3%), indicating that the counter
selection background is variable according to parameters that differ from
those that influence ET cloning efficiencies.

2.5 Transference of ET cloning between E.coli hosts

The experiments shown above were performed in recBC- E.coli hosts since
the sbcA mutation had been identified as a suppressor of recBC (Barbour
et al., Proc. Natl. Acad. Sci. USA 67 (1970), 128-135; Clark, Genetics 78
(1974), 259-271). However, many useful E.coli strains are recBC+,
including strains commonly used for propagation of P1, BAC or PAC
episomes. To transfer ET cloning into recBC+ strains, we developed
pBAD-ETγ and pBAD-αβγ (Figs. 13 and 14). These plasmids incorporate three
features important to the mobility of ET cloning. First, RecBC is the major
E.coli exonuclease and degrades introduced linear fragments. Therefore the
RecBC inhibitor, Redγ (Murphy, J. Bacteriol. 173 (1991), 5808-5821), was
included. Second, the recombinogenic potential of RecE/RecT, or
Redα/Redβ, was regulated by placing recE or redα under an inducible
promoter. Consequently ET cloning can be induced when required and
undesired recombination events are restricted at other times. Third,
we observed that ET cloning efficiencies are enhanced when RecT, or Redβ,
but not RecE, or Redα, is overexpressed. Therefore we placed recT, or redβ,
under the strong, constitutive EM7 promoter.

pBAD-ETγ was transformed into NS3145 E.coli harboring the original Hoxa
P1 episome (Fig. 11a). A region in the P1 vector backbone was targeted by
PCR amplification of the chloramphenicol resistance gene (cm) flanked by
n and p homology arms. As described above for positively selected ET
cloning reactions, most (> 90%) chloramphenicol resistant colonies were
correct. Notably, the overall efficiency of ET cloning, in terms of linear DNA
transformed, was nearly three times better using pBAD-ETγ than with
similar experiments based on targeting the same episome in the sbcA host,
JC9604. This is consistent with our observation that overexpression of
RecT improves ET cloning efficiencies.

A comparison between ET cloning efficiencies mediated by RecE/RecT,
expressed from pBAD-ETγ, and Redα/Redβ, expressed from pBAD-αβγ, was
made in the recA-, recBC+ E.coli strain, DK1 (Fig. 12). After transformation
of E.coli DK1 with either pBAD-ETγ or pBAD-αβγ, the same experiment as
described in Figure 6a, c, to replace the bla gene of the pBAD vector with
a chloramphenicol gene, was performed. Both pBAD-ETγ and pBAD-αβγ
presented similar ET cloning efficiencies in terms of responsiveness to
arabinose induction of RecE and Redα, and number of targeted events.

Claims

1. A method for cloning DNA molecules in cells comprising the steps of:

a) providing a host cell capable of performing homologous
recombination,

b) contacting in said host cell a first DNA molecule which is
capable of being replicated in said host cell with a second
DNA molecule comprising at least two regions of sequence
homology to regions on the first DNA molecule, under
conditions which favour homologous recombination between
said first and second DNA molecules and

c) selecting a host cell in which homologous recombination
between said first and second DNA molecules has occurred.

2. The method according to claim 1 wherein the homologous
recombination occurs via the recET cloning mechanism.

3. The method according to claim 2 wherein the host cell is capable of
expressing recE and recT genes.

4. The method according to claim 3 wherein the recE and recT genes
are selected from E.coli recE and recT genes or from λ redα and redβ
genes.

5. The method according to claim 3 or 4 wherein the host cell is
transformed with at least one vector capable of expressing recE
and/or recT genes.

6. The method of claim 3, 4 or 5 wherein the expression of the recE
and/or recT genes is under control of a regulatable promoter.

7. The method of claim 5 or 6 wherein the recT gene is overexpressed
versus the recE gene.

8. The method according to any one of claims 3 to 7 wherein the recE
gene is selected from a nucleic acid molecule comprising

(a) the nucleic acid sequence from position 1320 (ATG) to 2159
(GAC) as depicted in Fig.7B,

(b) the nucleic acid sequence from position 1320 (ATG) to 1998
(CGA) as depicted in Fig.13B,

(c) a nucleic acid encoding the same polypeptide within the
degeneracy of the genetic code and/or

(d) a nucleic acid sequence which hybridizes under stringent
conditions with the nucleic acid sequence from (a), (b) and/or (c).

9. The method according to any one of claims 3 to 8 wherein the recT
gene is selected from a nucleic acid molecule comprising

(a) the nucleic acid sequence from position 2155 (ATG) to 2961
(GAA) as depicted in Fig.7B,

(b) the nucleic acid sequence from position 2086 (ATG) to 2863
(GCA) as depicted in Fig.13B,

(c) a nucleic acid encoding the same polypeptide within the
degeneracy of the genetic code and/or

(d) a nucleic acid sequence which hybridizes under stringent
conditions with the nucleic acid sequences from (a), (b) and/or (c).

10. The method according to any one of the previous claims wherein the
host cell is a gram-negative bacterial cell.

11. The method according to claim 10 wherein the host cell is an
Escherichia coli cell.

12. The method according to claim 11 wherein the host cell is an
Escherichia coli K12 strain.

13. The method according to claim 12 wherein the E.coli strain is 
5 selected from JC 8679 and JC 9604. 

nf thf» nrfivious claims wherein the 
host cell further is capable of expressing a recBC inhibitor gene. 

io 15. The method according to claim 14 wherein the host cell is 
transformed with a vector expressing the recBC inhibitor gene. 

16. The method according to claim 14 or 15 wherein the recBC inhibitor 
gene is selected from a nucleic acid molecule comprising 

(a) the nucleic acid sequence from position 3588 (ATG) to 4002 
(GTA) as depicted in Fig.13B, 

(b) a nucleic acid encoding the same polypeptide within the 
degeneracy of the genetic code and/or 

(c) a nucleic acid sequence which hybridizes under stringent 
conditions (as defined above) with the nucleic acid sequence from (a) 
and/or (b). 

17. The method according to any one of claims 13 to 16 wherein the 
host cell is a prokaryotic recBC+ cell. 

18. The method according to any one of the previous claims wherein the 
first DNA molecule is circular. 

19. The method according to any one of the previous claims wherein the 
first DNA molecule is an extrachromosomal DNA molecule containing 
an origin of replication which is operative in the host cell. 

20. The method according to claim 18 or 19 wherein the first DNA 
molecule is selected from plasmids, cosmids, P1 vectors, BAC 
vectors and PAC vectors. 

21. The method according to any one of claims 1-18 wherein the first 
DNA molecule is a host cell chromosome. 

22. The method according to any one of the previous claims wherein the 
second DNA molecule is linear. 

23. The method according to any one of the previous claims wherein the 
regions of sequence homology are at least 15 nucleotides each. 
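
The 15-nucleotide lower bound can be illustrated with a simple check of the terminal regions of a linear molecule against a target sequence. This is a minimal sketch under simplifying assumptions: only exact matches on the given strand are considered, the target is treated as a linear string (no wrap-around for circular molecules), and the names `terminal_homology` and `has_sufficient_homology` as well as all example sequences are hypothetical.

```python
MIN_HOMOLOGY = 15  # lower bound stated in the claim


def terminal_homology(linear: str, target: str, from_end: bool = False) -> int:
    """Length of the longest terminal stretch of `linear` found verbatim in `target`."""
    linear, target = linear.upper(), target.upper()
    best = 0
    for n in range(1, len(linear) + 1):
        arm = linear[-n:] if from_end else linear[:n]
        if arm not in target:
            break  # a longer stretch cannot match if this one does not
        best = n
    return best


def has_sufficient_homology(linear: str, target: str,
                            minimum: int = MIN_HOMOLOGY) -> bool:
    """True if both ends of `linear` share at least `minimum` nt with `target`."""
    return (terminal_homology(linear, target) >= minimum
            and terminal_homology(linear, target, from_end=True) >= minimum)


# Hypothetical sequences: two 20-nt arms flanking an unrelated cassette.
left = "ATGCGTACCGTTAGCATCGA"
right = "CCGATTGACGTTAACGGATC"
target = "GGGG" + left + "TTTTTTTT" + right + "AAAA"
linear = left + "CACACACACACA" + right

print(terminal_homology(linear, target))
print(terminal_homology(linear, target, from_end=True))
print(has_sufficient_homology(linear, target))
```

A real design would also need to consider strand orientation and imperfect matches, which this sketch deliberately leaves out.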



24. The method according to one of claims 1 to 16 wherein the second 
DNA molecule is obtained by an amplification reaction. 

25. The method according to one of the previous claims wherein the first 
and/or second DNA molecules are introduced into the host cells by 
transformation. 

26. The method according to claim 25 wherein the transformation 
method is electroporation. 

27. The method according to one of claims 1 to 26 wherein the first and 
second DNA molecules are introduced into the host cell 
simultaneously by co-transformation. 

28. The method according to one of claims 1 to 26 wherein the second 
DNA molecule is introduced into a host cell in which the first DNA 
molecule is already present. 

43. A reagent kit for cloning comprising 

(a) a host cell 

(b) means of expressing recE and recT genes in said host cell and 

(c) a recipient cloning vehicle capable of being replicated in said cell. 

44. The reagent kit according to claim 43 wherein the means (b) 
comprise a vector system capable of expressing the recE and recT 
genes in the host cell. 

45. The reagent kit according to claim 43 or 44 wherein the recE and 
recT genes are selected from E.coli recE and recT genes or from λ 
redα and redβ genes. 

46. A reagent kit for cloning comprising 

(a) a source for RecE and RecT proteins and 

(b) a recipient cloning vehicle capable of being propagated in a host 
cell. 

47. The reagent kit according to claim 46 further comprising a host cell 
suitable for propagating said recipient cloning vehicle. 

48. The reagent kit according to claim 46 or 47 wherein said RecE and 
RecT proteins are selected from E.coli RecE and RecT proteins or 
from phage λ Redα and Redβ proteins. 

49. The reagent kit according to any one of claims 43-48 further 
comprising means for expressing a site specific recombinase in said 
host cell. 

50. The reagent kit according to any one of claims 43-49 further 
comprising nucleic acid amplification primers comprising a region of 
homology to said recipient cloning vehicle. 
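
Amplification primers that carry a region of homology to the recipient cloning vehicle can be pictured as a homology arm at the 5' end followed by a priming sequence at the 3' end. The following is a minimal sketch of that layout, not a validated primer-design procedure: the function names, the default priming length of 20 nucleotides and the example sequences are all hypothetical, and melting temperature, secondary structure and arm length checks are ignored.

```python
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Reverse complement of a DNA sequence (A, C, G, T only)."""
    return seq.upper().translate(_COMPLEMENT)[::-1]


def design_primers(left_arm: str, right_arm: str, insert: str,
                   priming_len: int = 20) -> tuple[str, str]:
    """Return (forward, reverse) primers, each written 5' to 3'.

    left_arm / right_arm: vehicle sequences flanking the intended
    insertion site, given on the same strand as `insert`.
    """
    if priming_len > len(insert):
        raise ValueError("priming length exceeds insert length")
    # Forward primer: left homology arm, then the start of the insert.
    forward = left_arm.upper() + insert[:priming_len].upper()
    # Reverse primer: reverse complement of (end of insert + right arm),
    # which places the right homology arm at the primer's 5' end.
    reverse = reverse_complement(insert[-priming_len:] + right_arm)
    return forward, reverse


# Hypothetical toy sequences, shortened for readability.
fwd, rev = design_primers("AAAAA", "CCCCC", "GGGGTTTT", priming_len=4)
print(fwd)
print(rev)
```

With such primers, the amplification product carries the two vehicle-derived arms at its ends.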
Figure 4b 

Figure 6 

Figure 7b 

1398 GGT ATC AGT AAG TCT CAG CTC GAT GAC ATT GCT 
27 ► Gly Ile Ser Lys Ser Gln Leu Asp Asp Ile Ala 

1431 GAT ACT CCG GCA CTA TAT TTG TGG CGT AAA AAT 
38 ► Asp Thr Pro Ala Leu Tyr Leu Trp Arg Lys Asn 

1464 GCC CCC GTG GAC ACC ACA AAG ACA AAA ACG CTC 
49 ► Ala Pro Val Asp Thr Thr Lys Thr Lys Thr Leu 

1497 GAT TTA GGA ACT GCT TTC CAC TGC CGG GTA CTT 
60 ► Asp Leu Gly Thr Ala Phe His Cys Arg Val Leu 

EcoRI 
1530 GAA CCG GAA GAA TTC AGT AAC CGC TTT ATC GTA 
71 ► Glu Pro Glu Glu Phe Ser Asn Arg Phe Ile Val 

1563 GCA CCT GAA TTT AAC CGC CGT ACA AAC GCC GGA 
82 ► Ala Pro Glu Phe Asn Arg Arg Thr Asn Ala Gly 

1596 AAA GAA GAA GAG AAA GCG TTT CTG ATG GAA TGC 
93 ► Lys Glu Glu Glu Lys Ala Phe Leu Met Glu Cys 

1629 GCA AGC ACA GGA AAA ACG GTT ATC ACT GCG GAA 
104 ► Ala Ser Thr Gly Lys Thr Val Ile Thr Ala Glu 

1662 GAA GGC CGG AAA ATT GAA CTC ATG TAT CAA AGC 
115 ► Glu Gly Arg Lys Ile Glu Leu Met Tyr Gln Ser 
1695 GTT ATG GCT TTG CCG CTG GGG CAA TGG CTT GTT 
126 ► Val Met Ala Leu Pro Leu Gly Gln Trp Leu Val 

1728 GAA AGC GCC GGA CAC GCT GAA TCA TCA ATT TAC 
137 ► Glu Ser Ala Gly His Ala Glu Ser Ser Ile Tyr 

1761 TGG GAA GAT CCT GAA ACA GGA ATT TTG TGT CGG 
148 ► Trp Glu Asp Pro Glu Thr Gly Ile Leu Cys Arg 

1794 TGC CGT CCG GAC AAA ATT ATC CCT GAA TTT CAC 
159 ► Cys Arg Pro Asp Lys Ile Ile Pro Glu Phe His 

1827 TGG ATC ATG GAC GTG AAA ACT ACG GCG GAT ATT 
170 ► Trp Ile Met Asp Val Lys Thr Thr Ala Asp Ile 

1860 CAA CGA TTC AAA ACC GCT TAT TAC GAC TAC CGC 
181 ► Gln Arg Phe Lys Thr Ala Tyr Tyr Asp Tyr Arg 

1893 TAT CAC GTT CAG GAT GCA TTC TAC AGT GAC GGT 
192 ► Tyr His Val Gln Asp Ala Phe Tyr Ser Asp Gly 

1926 TAT GAA GCA CAG TTT GGA GTG CAG CCA ACT TTC 
203 ► Tyr Glu Ala Gln Phe Gly Val Gln Pro Thr Phe 

1959 GTT TTT CTG GTT GCC AGC ACA ACT ATT GAA TGC 
214 ► Val Phe Leu Val Ala Ser Thr Thr Ile Glu Cys 

1992 GGA CGT TAT CCG GTT GAA ATT TTC ATG ATG GGC 
225 ► Gly Arg Tyr Pro Val Glu Ile Phe Met Met Gly 

2025 GAA GAA GCA AAA CTG GCA GGT CAA CAG GAA TAT 
236 ► Glu Glu Ala Lys Leu Ala Gly Gln Gln Glu Tyr 

2058 CAC CGC AAT CTG CGA ACC CTG TCT GAC TGC CTG 
247 ► His Arg Asn Leu Arg Thr Leu Ser Asp Cys Leu 

BalI 
2091 AAT ACC GAT GAA TGG CCA GCT ATT AAG ACA TTA 
258 ► Asn Thr Asp Glu Trp Pro Ala Ile Lys Thr Leu 

2124 TCA CTG CCC CGC TGG GCT AAG GAA TAT GCA A 
269 ► Ser Leu Pro Arg Trp Ala Lys Glu Tyr Ala A 

2155 ATG ACT AAG CAA CCA CCA ATC GCA AAA GCC GAT 
1 ► Met Thr Lys Gln Pro Pro Ile Ala Lys Ala Asp 
279 ► sn Asp 

2188 CTG CAA AAA ACT CAG GGA AAC CGT GCA CCA GCA 
12 ► Leu Gln Lys Thr Gln Gly Asn Arg Ala Pro Ala 

2221 GCA GTT AAA AAT AGC GAC GTG ATT AGT TTT ATT 
23 ► Ala Val Lys Asn Ser Asp Val Ile Ser Phe Ile 

2254 AAC CAG CCA TCA ATG AAA GAG CAA CTG GCA GCA 
34 ► Asn Gln Pro Ser Met Lys Glu Gln Leu Ala Ala 

NdeI 
2287 GCT CTT CCA CGC CAT ATG ACG GCT GAA CGT ATG 
45 ► Ala Leu Pro Arg His Met Thr Ala Glu Arg Met 
2320 ATC CGT ATC GCC ACC ACA GAA ATT CGT AAA GTT 
56 ► Ile Arg Ile Ala Thr Thr Glu Ile Arg Lys Val 

2353 CCG GCG TTA GGA AAC TGT GAC ACT ATG AGT TTT 
67 ► Pro Ala Leu Gly Asn Cys Asp Thr Met Ser Phe 

2386 ... ... GCG ... ... ... TGT ... ... CTC GGA 
78 ► Val Ser Ala Ile Val Gln Cys Ser Gln Leu Gly 

2419 CTT GAG CCA GGT AGC GCC CTC GGT CAT GCA TAT 
89 ► Leu Glu Pro Gly Ser Ala Leu Gly His Ala Tyr 

2452 TTA CTG CCT TTT GGT AAT AAA AAC GAA AAG AGC 
100 ► Leu Leu Pro Phe Gly Asn Lys Asn Glu Lys Ser 

2485 GGT AAA AAG AAC GTT CAG CTA ATC ATT GGC TAT 
111 ► Gly Lys Lys Asn Val Gln Leu Ile Ile Gly Tyr 

2518 CGC GGC ATG ATT GAT CTG GCT CGC CGT TCT GGT 
122 ► Arg Gly Met Ile Asp Leu Ala Arg Arg Ser Gly 

2551 CAA ATC GCC AGC CTG TCA GCC CGT GTT GTC CGT 
133 ► Gln Ile Ala Ser Leu Ser Ala Arg Val Val Arg 

2584 GAA GGT GAC GAG TTT AGC TTC GAA TTT GGC CTT 
144 ► Glu Gly Asp Glu Phe Ser Phe Glu Phe Gly Leu 

2617 GAT GAA AAG TTA ATA CAC CGC CCG GGA GAA AAC 
155 ► Asp Glu Lys Leu Ile His Arg Pro Gly Glu Asn 

2650 GAA GAT GCC CCG GTT ACC CAC GTC TAT GCT GTC 
166 ► Glu Asp Ala Pro Val Thr His Val Tyr Ala Val 

2683 GCA AGA CTG AAA GAC GGA GGT ACT CAG TTT GAA 
177 ► Ala Arg Leu Lys Asp Gly Gly Thr Gln Phe Glu 

2716 GTT ATG ACG CGC AAA CAG ATT GAG CTG GTG CGC 
188 ► Val Met Thr Arg Lys Gln Ile Glu Leu Val Arg 
2749 ... ... ... ... ... ... ... ... ... ... ... 
199 ► Ser Leu Ser Lys Ala Gly Asn Asn Gly Pro Trp 

2782 GTA ACT ... TGG GAA GAA ATG GCA AAG AAA ACG 
210 ► Val Thr His Trp Glu Glu Met Ala Lys Lys Thr 

2815 GCT ATT CGT CGC CTG TTC AAA TAT TTG CCC GTA 
221 ► Ala Ile Arg Arg Leu Phe Lys Tyr Leu Pro Val 

2848 TCA ATT GAG ATC CAG CGT GCA GTA TCA ATG GAT 
232 ► Ser Ile Glu Ile Gln Arg Ala Val Ser Met Asp 

PstI 
2881 GAA AAG GAA CCA CTG ACA ATC GAT CCT GCA GAT 
243 ► Glu Lys Glu Pro Leu Thr Ile Asp Pro Ala Asp 

2914 TCC TCT GTA TTA ACC GGG GAA TAC AGT GTA ATC 
254 ► Ser Ser Val Leu Thr Gly Glu Tyr Ser Val Ile 

BglII HindIII 
2947 GAT AAT TCA GAG GAA TAG ATCTAAGCTT 
265 ► Asp Asn Ser Glu Glu *** 

2975 GGCTGTTTTG GCGGATGAGA GAAGATTTTC AGCCTGATAC 

3015 AGATTAAATC AGAACGCAGA AGCGGTCTGA TAAAACAGAA 

3055 TTTGCCTGGC GGCAGTAGCG CGGTGGTCCC ACCTGACCCC 

3095 ATGCCGAACT CAGAAGTGAA ACGCCGTAGC GCCGATGGTA 

3135 GTGTGGGGTC TCCCCATGCG AGAGTAGGGA ACTGCCAGGC 

3175 ATCAAATAAA ACGAAAGGCT CAGTCGAAAG ACTGGGCCTT 

3215 TCGTTTTATC TGTTGTTTGT CGGTGAACGC TCTCCTGAGT 

3255 AGGACAAATC CGCCGGGAGC GGATTTGAAC GTTGCGAAGC 

3295 AACGGCCCGG AGGGTGGCGG GCAGGACGCC CGCCATAAAC 

3335 TGCCAGGCAT CAAATTAAGC AGAAGGCCAT CCTGACGGAT 
3375 GGCCTTTTTG CGTTTCTACA AACTCTTTTG TTTATTTTTC 

3415 TAAATACATT CAAATATGTA TCCGCTCATG AGACAATAAC 

3455 CCTGATAAAT GCTTCAATAA TATTGAAAAA GGAAGAGT AT 
1 ► Me 

3495 G AGT ATT CAA CAT TTC CGT GTC GCC CTT ATT 
1 ► t Ser Ile Gln His Phe Arg Val Ala Leu Ile 

3526 CCC TTT TTT GCG GCA TTT TGC CTT CCT GTT TTT 
12 ► Pro Phe Phe Ala Ala Phe Cys Leu Pro Val Phe 

3559 GCT CAC CCA GAA ACG CTG GTG AAA GTA AAA GAT 
23 ► Ala His Pro Glu Thr Leu Val Lys Val Lys Asp 

3592 GCT GAA GAT CAG TTG GGT GCA CGA GTG GGT TAC 
34 ► Ala Glu Asp Gln Leu Gly Ala Arg Val Gly Tyr 

3625 ATC GAA CTG GAT CTC AAC AGC ... AAG ATC CTT 
45 ► Ile Glu Leu Asp Leu Asn Ser Gly Lys Ile Leu 

3658 GAG AGT TTT CGC CCC GAA GAA CGT TTT CCA ATG 
56 ► Glu Ser Phe Arg Pro Glu Glu Arg Phe Pro Met 

3691 ATG AGC ACT TTT AAA GTT CTG CTA TGT GGC GCG 
67 ► Met Ser Thr Phe Lys Val Leu Leu Cys Gly Ala 

3724 GTA TTA TCC CGT GTT GAC GCC GGG CAA GAG CAA 
78 ► Val Leu Ser Arg Val Asp Ala Gly Gln Glu Gln 

3757 CTC GGT CGC CGC ATA CAC TAT TCT CAG AAT GAC 
89 ► Leu Gly Arg Arg Ile His Tyr Ser Gln Asn Asp 

ScaI 
3790 TTG GTT GAG TAC TCA CCA GTC ACA GAA AAG CAT 
100 ► Leu Val Glu Tyr Ser Pro Val Thr Glu Lys His 

3823 CTT ACG GAT GGC ATG ACA GTA AGA GAA TTA TGC 
111 ► Leu Thr Asp Gly Met Thr Val Arg Glu Leu Cys 
3856 AGT GCT GCC ATA ACC ATG AGT GAT AAC ACT GCG 
122 ► Ser Ala Ala Ile Thr Met Ser Asp Asn Thr Ala 

3889 GCC AAC TTA CTT CTG ACA ACG ATC GGA GGA CCG 
133 ► Ala Asn Leu Leu Leu Thr Thr Ile Gly Gly Pro 

3922 AAG GAG CTA ACC GCT TTT TTG CAC AAC ATG GGG 
144 ► Lys Glu Leu Thr Ala Phe Leu His Asn Met Gly 

3955 GAT CAT GTA ACT CGC CTT GAT CGT TGG GAA CCG 
155 ► Asp His Val Thr Arg Leu Asp Arg Trp Glu Pro 

3988 GAG CTG AAT GAA GCC ATA CCA AAC GAC GAG CGT 
166 ► Glu Leu Asn Glu Ala Ile Pro Asn Asp Glu Arg 

4021 GAC ACC ACG ATG CCT GTA GCA ATG GCA ACA ACG 
177 ► Asp Thr Thr Met Pro Val Ala Met Ala Thr Thr 

4054 TTG CGC AAA CTA TTA ACT GGC GAA CTA CTT ACT 
188 ► Leu Arg Lys Leu Leu Thr Gly Glu Leu Leu Thr 

4087 CTA GCT TCC CGG CAA CAA TTA ATA GAC TGG ATG 
199 ► Leu Ala Ser Arg Gln Gln Leu Ile Asp Trp Met 

4120 GAG GCG GAT AAA GTT GCA GGA CCA CTT CTG CGC 
210 ► Glu Ala Asp Lys Val Ala Gly Pro Leu Leu Arg 

4153 TCG GCC CTT CCG GCT GGC TGG TTT ATT GCT GAT 
221 ► Ser Ala Leu Pro Ala Gly Trp Phe Ile Ala Asp 

4186 AAA TCT GGA GCC GGT GAG CGT GGG TCT CGC GGT 
232 ► Lys Ser Gly Ala Gly Glu Arg Gly Ser Arg Gly 

4219 ATC ATT GCA GCA CTG GGG CCA GAT GGT AAG CCC 
243 ► Ile Ile Ala Ala Leu Gly Pro Asp Gly Lys Pro 

4252 TCC CGT ATC GTA GTT ATC TAC ACG ACG GGG AGT 
254 ► Ser Arg Ile Val Val Ile Tyr Thr Thr Gly Ser 
4285 CAG GCA ACT ATG GAT GAA CGA AAT AGA CAG ATC 
265 ► Gln Ala Thr Met Asp Glu Arg Asn Arg Gln Ile 

4318 GCT GAG ATA GGT GCC TCA CTG ATT AAG CAT TGG 
276 ► Ala Glu Ile Gly Ala Ser Leu Ile Lys His Trp 

4351 TAA CTGTCAGACC AAGTTTACTC ATATATACTT 
287 ► *** 

[positions 4384 to 4703: sequence not legible] 

4704 TCCAAACTTG AACAACACTC AACCCTATCT CGGGCTATTC 

4744 TTTTGATTTA TAAGGGATTT TGCCGATTTC GGCCTATTGG 

4784 TTAAAAAATG AGCTGATTTA ACAAAAATTT AACGCGAATT 

4824 TTAACAAAAT ATTAACGTTT ACAATTTAAA AGGATCTAGG 

4864 TGAAGATCCT TTTTGATAAT CTCATGACCA AAATCCCTTA 

4904 ACGTGAGTTT TCGTTCCACT GAGCGTCAGA CCCCGTAGAA 

4944 AAGATCAAAG GATCTTCTTG AGATCCTTTT TTTCTGCGCG 

4984 TAATCTGCTG CTTGCAAACA AAAAAACCAC CGCTACCAGC 

5024 GGTGGTTTGT TTGCCGGATC AAGAGCTACC AACTCTTTTT 
5064 CCGAAGGTAA CTGGCTTCAG CAGAGCGCAG ATACCAAATA 

5104 CTGTCCTTCT AGTGTAGCCG TAGTTAGGCC ACCACTTCAA 

5144 GAACTCTGTA GCACCGCCTA CATACCTCGC TCTGCTAATC 

5184 CTGTTACCAG TGGCTGCTGC CAGTGGCGAT AAGTCGTGTC 

5224 TTACCGGGTT GGACTCAAGA CGATAGTTAC CGGATAAGGC 

5264 GCAGCGGTCG GGCTGAACGG GGGGTTCGTG CACACAGCCC 

5304 AGCTTGGAGC GAACGACCTA CACCGAACTG AGATACCTAC 

5344 AGCGTGAGCT ATGAGAAAGC GCCACGCTTC CCGAAGGGAG 

5384 AAAGGCGGAC AGGTATCCGG TAAGCGGCAG GGTCGGAACA 

5424 GGAGAGCGCA CGAGGGAGCT TCCAGGGGGA AACGCCTGGT 

5464 ATCTTTATAG TCCTGTCGGG TTTCGCCACC TCTGACTTGA 

5504 GCGTCGATTT TTGTGATGCT CGTCAGGGGG GCGGAGCCTA 

5544 TGGAAAAACG CCAGCAACGC GGCCTTTTTA CGGTTCCTGG 

5584 .......... GCCTTTTGCT CACATGTTCT TTCCTGCGTT 

5624 ATCCCCTGAT TCTGTGGATA ACCGTATTAC CGCCTTTGAG 

5664 TGAGCTGATA CCGCTCGCCG CAGCCGAACG ACCGAGCGCA 

5704 GCGAGTCAGT GAGCGAGGAA GCGGAAGAGC GCCTGATGCG 

5744 .......... .......... .......... TTCACACCGC 

5784 ATAGGGTCAT GGCTGCGCCC CGACACCCGC CAACACCCGC 

5824 TGACGCGCCC TGACGGGCTT GTCTGCTCCC GGCATCCGCT 

5864 TACAGACAAG CTGTGACCGT CTCCGGGAGC TGCATGTGTC 

5904 AGAGGTTTTC ACCGTCATCA CCGAAACGCG CGAGGCAGCA 

5944 AGGAGATGGC GCCCAACAGT CCCCCGGCCA CGGGGCCTGC 
5984 CACCATACCC ACGCCGAAAC AAGCGCTCAT GAGCCCGAAG 

6024 TGGCGAGCCC GATCTTCCCC ATCGGTGATG TCGGCGATAT 

6064 AGGCGCCAGC AACCGCACCT GTGGCGCCGG TGATGCCGGC 

6104 CACGATGCGT CCGGCGTAGA GGATCTGCTC ATGTTTGACA 

6144 GCTTATC 
Figure 9a 

Figure 13b 

1 ATCGATGCATAATGTGCCTGTCAAATGGACGAAGCAGGG 

40 ATTCTGCAAACCCTATGCTACTCCGTCAAGCCGTCAATT 

79 GTCTGATTCGTTACCAA TTA TGA CAA CTT GAC 
293 ◄ *** Ser Leu Lys Val 

111 GGC TAC ATC ATT CAC TTT TTC TTC ACA ACC 
288 ◄ Ala Val Asp Asn Val Lys Glu Glu Cys Gly 

141 GGC ACG GAA CTC GCT CGG GCT GGC CCC GGT 
278 ◄ Ala Arg Phe Glu Ser Pro Ser Ala Gly Thr 

171 GCA TTT TTT AAA TAC CCG CGA GAA ATA GAG 
268 ◄ Cys Lys Lys Phe Val Arg Ser Phe Tyr Leu 

201 TTG ATC GTC AAA ACC AAC ATT GCG ACC GAC 
258 ◄ Gln Asp Asp Phe Gly Val Asn Arg Gly Val 

231 GGT GGC GAT AGG CAT CCG GGT GGT GCT CAA 
248 ◄ Thr Ala Ile Pro Met Arg Thr Thr Ser Leu 

261 AAG ... CTT CGC CTG GCT GAT ... TTG GTC 
238 ◄ Leu Leu Lys Ala Gln Ser Ile Arg Gln Asp 

291 CTC GCG CCA GCT TAA GAC GCT AAT CCC TAA 
228 ◄ Glu Arg Trp Ser Leu Val Ser Ile Gly Leu 

321 CTG CTG GCG GAA AAG ATG TGA CAG ACG CGA 
218 ◄ Gln Gln Arg Phe Leu His Ser Leu Arg Ser 

351 CGG CGA CAA GCA AAC ATG CTG TGC GAC GCT 
208 ◄ Pro Ser Leu Cys Val His Gln Ala Val Ser 

381 GGC GAT ATC AAA ATT GCT GTC TGC CAG GTG 
198 ◄ Ala Ile Asp Phe Asn Ser Asp Ala Leu His 

411 ATC GCT GAT GTA CTG ACA AGC CTC GCG ... 
188 ◄ Asp Ser Ile Tyr Gln Cys Ala Glu Arg Val 

441 CCG ATT ATC CAT CGG TGG ATG GAG CGA CTC 
178 ◄ Arg Asn Asp Met Pro Pro His Leu Ser Glu 

471 ... ... ... TTC CAT GCG ... ... TAA CAA 
168 ◄ Asn Ile Ala Glu Met Arg Arg Leu Leu Leu 

501 ... CTC ... CAG ATT TAT CGC CAG CAG CTC 
158 ◄ Gln Glu Leu Leu Asn Ile Ala Leu Leu Glu 

531 CGA ATA ... CCC ... CCC TTG ... GGC GTT 
148 ◄ Ser Tyr Arg Gly Glu Gly Gln Gly Ala Asn 

561 AAT GAT ... CCC AAA ... GTC GCT ... ATG 
138 ◄ Ile Ile Gln Gly Phe Leu Asp Ser Phe His 

591 CGG CTG GTG CGC ... ATC CGG GCG AAA GAA 
128 ◄ Pro Gln His Ala Glu Asp Pro Arg Phe Phe 

621 CCC CGT ATT ... AAA ... TGA CGG CCA ... 
118 ◄ Gly Thr Asn ... Phe Ile Ser Pro Trp Asn 

651 AAG CCA ... ATG ... GTA ... GCG CGG ... 
108 ◄ Leu Trp Glu His Trp Tyr Ala Arg Pro Arg 

681 AAA GTA AAC CCA CTG GTG ATA CCA TTC ... 
98 ◄ Phe Tyr Val Trp Gln His Tyr Trp Glu Arg 

711 AGC CTC CGG ATG ACG ACC GTA GTG ATG AAT 
88 ◄ Ala Glu Pro His Arg Gly Tyr His His Ile 

741 CTC TCC TGG CGG GAA CAG CAA AAT ATC ACC 
78 ◄ Glu Gly Pro Pro Phe Leu Leu Ile Asp Gly 

771 CGG TCG GCA AAC AAA TTC TCG ... CTG ATT 
68 ◄ Pro Arg Cys Val Phe Glu Arg Gly Gln Asn 
801 ... CAC CAC CCC CTG ACC GCG AAT GGT GAG 
58 ◄ Lys Val Val Gly Gln Gly Arg Ile Thr Leu 

831 ATT GAG AAT ... ... TTT CAT ... CAG ... 
48 ◄ Asn Leu Ile Tyr Gly Lys Met Gly Leu Pro 

861 TCG GTC ... AAA AAA ATC GAG ATA ACC GTT 
38 ◄ Arg Asp Ile Phe Phe Asp Leu Tyr Gly Asn 

891 GGC CTC AAT ... CGT TAA ACC CGC CAC CAG 
28 ◄ Ala Glu Ile Pro Thr Leu Gly Ala Val Leu 

921 ATG GGC ATT AAA CGA GTA TCC CGG CAG CAG 
18 ◄ His Ala Asn Phe Ser Tyr Gly Pro Leu Leu 

951 GGG ATC ATT TTG CGC TTC AGC CAT ACTTTTC 
8 ◄ Pro Asp Asn Gln Ala Glu Ala Met 

982 ATACTCCCGCCATTCAGAGAAGAAACCAATTGTCCATAT 

1021 TGCATCAGACATTGCCGTCACTGCGTCTTTTACTGGCTC 

1060 TTCTCGCTAACCAAACCGGTAACCCCGCTTATTAAAAGC 

1099 ATTCTGTAACAAAGCGGGACCAAAGCCATGACAAAAACG 

1138 CGTAACAAAAGTGTCTATAATCACGGCAGAAAAGTCCAC 

1177 ATTGATTATTTGCACGGCGTCACACTTTGCTATGCCATA 

BamHI 
1216 GCATTTTTATCCATAAGATTAGCGGATCCTACCTGACGC 
1255 TTTTTATCGC...TCTCTACTGTTTCTCCATACCCGTTT 

NheI EcoRI NcoI BamHI 
1294 TTTTGGGCTAGCAGGAGGAATTCACC ATG GAT CCC 
1 ► Met Asp Pro 

1329 GTA ATC GTA GAA GAC ATA GAG CCA GGT ATT 
4 ► Val Ile Val Glu Asp Ile Glu Pro Gly Ile 

1359 TAT TAC GGA ATT TCG AAT GAG AAT TAC CAC 
14 ► Tyr Tyr Gly Ile Ser Asn Glu Asn Tyr His 

1389 GCG GGT ... ... ... ... ... ... ... ... 
24 ► Ala Gly Pro Gly Ile Ser Lys ... Gln Leu 

1419 GAT GAC ATT GCT GAT ACT CCG GCA CTA TAT 
34 ► Asp Asp Ile Ala Asp Thr Pro Ala Leu Tyr 

1449 TTG TGG CGT AAA AAT GCC CCC GTG GAC ACC 
44 ► Leu Trp Arg Lys Asn Ala Pro Val Asp Thr 

1479 ACA AAG ACA AAA ACG CTC GAT TTA GGA ACT 
54 ► Thr Lys Thr Lys Thr Leu Asp Leu Gly Thr 

1509 GCT TTC CAC TGC CGG GTA CTT GAA CCG GAA 
64 ► Ala Phe His Cys Arg Val Leu Glu Pro Glu 

EcoRI 
1539 GAA TTC AGT AAC CGC TTT ATC GTA GCA CCT 
74 ► Glu Phe Ser Asn Arg Phe Ile Val Ala Pro 

1569 GAA TTT AAC CGC CGT ACA AAC GCC GGA AAA 
84 ► Glu Phe Asn Arg Arg Thr Asn Ala Gly Lys 

1599 GAA GAA GAG AAA GCG TTT CTG ATG GAA TGC 
94 ► Glu Glu Glu Lys Ala Phe Leu Met Glu Cys 

1629 GCA AGC ACA GGA AAA ACG GTT ATC ACT GCG 
104 ► Ala Ser Thr Gly Lys Thr Val Ile Thr Ala 
1659 GAA GAA GGC CGG AAA ATT GAA CTC ATG TAT 
114 ► Glu Glu Gly Arg Lys Ile Glu Leu Met Tyr 

1689 CAA AGC GTT ATG GCT TTG CCG CTG GGG CAA 
124 ► Gln Ser Val Met Ala Leu Pro Leu Gly Gln 

1719 TGG CTT GTT GAA AGC GCC GGA CAC GCT GAA 
134 ► Trp Leu Val Glu Ser Ala Gly His Ala Glu 

1749 TCA TCA ATT TAC TGG GAA GAT CCT GAA ACA 
144 ► Ser Ser Ile Tyr Trp Glu Asp Pro Glu Thr 

1779 GGA ATT TTG TGT CGG TGC CGT CCG GAC AAA 
154 ► Gly Ile Leu Cys Arg Cys Arg Pro Asp Lys 

1809 ATT ATC CCT GAA TTT CAC TGG ATC ATG GAC 
164 ► Ile Ile Pro Glu Phe His Trp Ile Met Asp 

1839 GTG AAA ACT ACG GCG GAT ATT CAA CGA TTC 
174 ► Val Lys Thr Thr Ala Asp Ile Gln Arg Phe 

1869 AAA ACC GCT TAT TAC GAC TAC CGC TAT CAC 
184 ► Lys Thr Ala Tyr Tyr Asp Tyr Arg Tyr His 

1899 GTT CAG GAT GCA TTC TAC AGT GAC GGT TAT 
194 ► Val Gln Asp Ala Phe Tyr Ser Asp Gly Tyr 

1929 GAA GCA CAG TTT GGA GTG CAG CCA ACT TTC 
204 ► Glu Ala Gln Phe Gly Val Gln Pro Thr Phe 

1959 GTT TTT CTG GTT GCC AGC ACA ACT ATT GAA 
214 ► Val Phe Leu Val Ala Ser Thr Thr Ile Glu 

1989 TGC GGA CGT TAT CCG GTT GAA ATT TTC ATG 
224 ► Cys Gly Arg Tyr Pro Val Glu Ile Phe Met 

2019 ATG GGC GAA GAA GCA AAA CTG GCA GGT CAA 
234 ► Met Gly Glu Glu Ala Lys Leu Ala Gly Gln 
2049 CAG GAA TAT CAC CGC AAT CTG CGA ACC CTG 
244 ► Gln Glu Tyr His Arg Asn Leu Arg Thr Leu 

2079 TCT GAC TGC CTG AAT ACC GAT GAA TGG CCA 
254 ► Ser Asp Cys Leu Asn Thr Asp Glu Trp Pro 

2109 GCT ATT AAG ACA TTA TCA CTG CCC CGC TGG 
264 ► Ala Ile Lys Thr Leu Ser Leu Pro Arg Trp 

XhoI KpnI 
2139 GCT AAG GAA TAT GCA AAT GAC TAGATCTCGAG 
274 ► Ala Lys Glu Tyr Ala Asn Asp 

2171 GTACCCGAGCACGTGTTGACAATTAATCATCGGCATAGT 

2210 ATATCGGCATAGTATAATACGACAAGGTGAGGAACTAAA 

NcoI 
2249 CC ATG GCT AAG CAA CCA CCA ATC GCA AAA 
1 ► Met Ala Lys Gln Pro Pro Ile Ala Lys 

2278 GCC GAT CTG CAA AAA ACT CAG GGA AAC CGT 
10 ► Ala Asp Leu Gln Lys Thr Gln Gly Asn Arg 

2308 GCA CCA GCA GCA GTT AAA AAT AGC GAC GTG 
20 ► Ala Pro Ala Ala Val Lys Asn Ser Asp Val 

2338 ATT AGT TTT ATT AAC CAG CCA TCA ATG AAA 
30 ► Ile Ser Phe Ile Asn Gln Pro Ser Met Lys 

2368 GAG CAA CTG GCA GCA GCT CTT CCA CGC CAT 
40 ► Glu Gln Leu Ala Ala Ala Leu Pro Arg His 

2398 ATG ACG GCT GAA ... ATG ATC CGT ATC GCC 
50 ► Met Thr Ala Glu Arg Met Ile Arg Ile Ala 

2428 ACC ACA GAA ATT ... ... GTT CCG GCG TTA 
60 ► Thr Thr Glu Ile Arg Lys Val Pro Ala Leu 
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Figure 1 3b (cont'd) 



2458 GGA 


AAC 


TGT 


GAC 


ACT 


ATG 


AGT 


TIT 


GTC 


AGT 


70 ► Gl y 


Asn 


Cys 


Asp 


Th r 


ft ^ 

Me t 


Ser 


Ph e 


va l 


be r 


2488 GCG 


ATC 


GTA 


CAG 


TGT 


TCA 


CAG 


CTC 


GGA 


CTT 


80 ► A 1 a 


1 1 e 


Va 1 


Gl n 


Cys 


Se r 


Gl n 


Leu 


\j\ y 


Leu 


2518 GAG 


CCA 


GGT 


AGC 


GCC 


CTC 


GGT 


CAT 


GCA 


TAT 


90 ► Gl u 


P ro 


Gl y 


Ser 


Al a 


Leu 


Gl y 


Hi s 


a i a 


\ y r 


2548 TTA 


CTG 


CCT 


.j_.j_.j_, 


GGT 


AAT 


AAA 


AAC 


GAA 


AAG 


100> Leu 


Leu 


P ro 


Phe 


Gl y 


Asn 


Ly s 


Asn 


Gl u 


Ly s 


2578 AGC 


GGT 


AAA 


AAG 


AAC 


GTT 


CAG 


CTA 


ATC 


ATT 


110 ► Ser 


Gl y 


Lys 


Ly s 


Asn 


Va 1 


Gl n 


Leu 


1 1 e 


1 I e 


2608 GGC 


TAT 


CGC 


GGC 


ATG 


ATT 


GAT 


CTG 


GCT 


CGC 


120 ► Gl y 


Tyr 


A rg 


Gl y 


Me t 


1 1 e 


A — ~ 

Asp 


Leu 


A 1 a 


A rg 


2638 CGT 


TCT 


GGT 


CAA 


ATC 


GCC 


AGC 


CTG 


TCA 


GCC 


13 0 ► A rg 


. Ser 


Gl y 


Gl n 


1 1 e 


A 1 a 


Ser 


Leu 


be r 


A 1 ^ 

a i a 


2668 CGT 


GTT 




CGT 


GAA 


GGT 


GAC 




tit 


AGC 


140^ A rg 


Va i 


Va 1 


A rg 


Gl u 


Gl y 


A - 

Asp 


Gl u 


rrl e 


C2/^ r 


2698 TTC 


GAA 


TIT 


GGC 


CTT 


GAT 


GAA 


AAG 


TTA 


ATA 


150 ► Phe 


Gl u 


Phe 


.Gl y 


Leu 


Asp 


Gl-U 


Ly s 


Leu 


I I e 


2728 CAC 


CGC 


CCG 


GGA 


GAA 


AAC 


GAA 


GAT 


GCC 


CCG 


160> Hi s 


A rg 


P ro 


Gl y 


Gl u 


Asn 


Gl u 


Asp 


a l a 


D r 

ro 


2758 GTT ACC CAC GTC TAT GCT GTC GCA AGA CTG
170 ► Val Thr His Val Tyr Ala Val Ala Arg Leu


2788 AAA GAC GGA GGT ACT CAG TTT GAA GTT ATG
180 ► Lys Asp Gly Gly Thr Gln Phe Glu Val Met


2818 ACG CGC AAA CAG ATT GAG CTG GTG CGC AGC
190 ► Thr Arg Lys Gln Ile Glu Leu Val Arg Ser
Figure 13b (cont'd)



2848 CTG AGT AAA GCT GGT AAT AAC GGG CCG TGG
200 ► Leu Ser Lys Ala Gly Asn Asn Gly Pro Trp


2878 GTA ACT CAC TGG GAA GAA ATG GCA AAG AAA
210 ► Val Thr His Trp Glu Glu Met Ala Lys Lys


2908 ACG GCT ATT CGT CGC CTG TTC AAA TAT TTG
220 ► Thr Ala Ile Arg Arg Leu Phe Lys Tyr Leu


2938 CCC GTA TCA ATT GAG ATC CAG CGT GCA GTA
230 ► Pro Val Ser Ile Glu Ile Gln Arg Ala Val


2968 TCA ATG GAT GAA AAG GAA CCA CTG ACA ATC
240 ► Ser Met Asp Glu Lys Glu Pro Leu Thr Ile


2998 GAT CCT GCA GAT TCC TCT GTA TTA ACC GGG
250 ► Asp Pro Ala Asp Ser Ser Val Leu Thr Gly


3028 GAA TAC AGT GTA ATC GAT AAT TCA GAG GAA
260 ► Glu Tyr Ser Val Ile Asp Asn Ser Glu Glu


BglII HindIII

3058 TAG ATCTAAGCTTCCTGCTGAACATCAAAGGCAAGAAA
270 ► • • •

3096 ACATCTGTTGTCAAAGACAGCATCCTTGAACAAGGACAA

3135 TTAACAGTTAACAAATAAAAACGCAAAAGAAAATGCCGA

3174 TATCCTATTGGCATTTTCTTTTATTTCTTATCAACATAA

XhoI

3213 AGGTGAATCCCATACCTCGAGCTTCACGCTGCCGCAAGC

3252 ACTCAGGGCGCAAGGGCTGCTAAAAGGAAGCGGAACACG

3291 TAGAAAGCCAGTCCGCAGAAACGGTGCTGACCCCGGATG

3330 AATGTCAGCTACTGGGCTATCTGGACAAGGGAAAACGCA

3369 AGCGCAAAGAGAAAGCAGGTAGCTTGCAGTGGGCTTACA
Figure 13b (cont'd)

3408 TGGCGATAGCTAGACTGGGCGGTTTTATGGACAGCAAGC

3447 GAACCGGAATTGCCAGCTGGGGCGCCCTCTGGTAAGGTT

3486 GGGAAGCCCTGCAAAGTAAACTGGATGGCTTTCTTGCCG

BglII

3525 CCAAGGATCTGATGGCGCAGGGGATCAAGATCTGATCAA

3564 GAGACAGGATGAGGATCGTTTCGC ATG GAT ATT

1 ► Met Asp Ile
94 ► Arg His Gly Ala Ser Lys
Ser Ile Thr
Figure 13b (cont'd)

3897 CGT GCG TTT GAT GAC GAT GTT GAG TTT CAG
104 ► Arg Ala Phe Asp Asp Asp Val Glu Phe Gln

3927 GAG CGC ATG GCA GAA CAC ATC CGG TAC ATG
114 ► Glu Arg Met Ala Glu His Ile Arg Tyr Met

3957 GTT GAA ACC ATT GCT CAC CAC CAG GTT GAT
124 ► Val Glu Thr Ile Ala His His Gln Val Asp

HindIII

3987 ATT GAT TCA GAG GTA TAA AACGAGTAGA AGCT
134 ► Ile Asp Ser Glu Val • • •

4019 TGGCTGTTTTGGCGGATGAGAGAAGATTTTCAGCCTGAT

4058 ACAGATTAAATCAGAACGCAGAAGCGGTCTGATAAAACA

4097 GAATTTGCCTGGCGGCAGTAGCGCGGTGGTCCCACCTGA

4136 CCCCATGCCGAACTCAGAAGTGAAACGCCGTAGCGCCGA

4175 TGGTAGTGTGGGGTCTCCCCATGCGAGAGTAGGGAACTG

4214 CCAGGCATCAAATAAAACGAAAGGCTCAGTCGAAAGACT

4253 GGGCCTTTCGTTTTATCTGTTGTTTGTCGGTGAACGCTC

4292 TCCTGAGTAGGACAAATCCGCCGGGAGCGGATTTGAACG

4331 TTGCGAAGCAACGGCCCGGAGGGTGGCGGGCAGGACGCC

4370 CGCCATAAACTGCCAGGCATCAAATTAAGCAGAAGGCCA

4409 TCCTGACGGATGGCCTTTTTGCGTTTCTACAAACTCTTT

4487 ATGAGACAATAACCCTGATAAATGCTTCAATAATATTGA

4526 AAAAGGAAGAGT ATG AGT ATT CAA CAT TTC

1 ► Met Ser Ile Gln His Phe
Figure 13b (cont'd)

4556 CGT GTC GCC CTT ATT CCC TTT TTT GCG GCA
7 ► Arg Val Ala Leu Ile Pro Phe Phe Ala Ala

4586 TTT TGC CTT CCT GTT TTT GCT CAC CCA GAA
17 ► Phe Cys Leu Pro Val Phe Ala His Pro Glu

4616 ACG CTG GTG AAA GTA AAA GAT GCT GAA GAT
27 ► Thr Leu Val Lys Val Lys Asp Ala Glu Asp

4646 CAG TTG GGT GCA CGA GTG GGT TAC ATC GAA
37 ► Gln Leu Gly Ala Arg Val Gly Tyr Ile Glu

4676 CTG GAT CTC AAC AGC GGT AAG ATC CTT GAG
47 ► Leu Asp Leu Asn Ser Gly Lys Ile Leu Glu

4706 AGT TTT CGC CCC GAA GAA CGT TTT CCA ATG
57 ► Ser Phe Arg Pro Glu Glu Arg Phe Pro Met

4736 ATG AGC ACT TTT AAA GTT CTG CTA TGT GGC
67 ► Met Ser Thr Phe Lys Val Leu Leu Cys Gly

4766 GCG GTA TTA TCC CGT GTT GAC GCC GGG CAA
77 ► Ala Val Leu Ser Arg Val Asp Ala Gly Gln

4796 GAG CAA CTC GGT CGC CGC ATA CAC TAT TCT
87 ► Glu Gln Leu Gly Arg Arg Ile His Tyr Ser

ScaI

4826 CAG AAT GAC TTG GTT GAG TAC TCA CCA GTC
97 ► Gln Asn Asp Leu Val Glu Tyr Ser Pro Val

4856 ACA GAA AAG CAT CTT ACG GAT GGC ATG ACA
107 ► Thr Glu Lys His Leu Thr Asp Gly Met Thr

4886 GTA AGA GAA TTA TGC AGT GCT GCC ATA ACC
117 ► Val Arg Glu Leu Cys Ser Ala Ala Ile Thr

4916 ATG AGT GAT AAC ACT GCG GCC AAC TTA CTT
127 ► Met Ser Asp Asn Thr Ala Ala Asn Leu Leu
Figure 13b (cont'd)

5336 ACT ATG GAT GAA CGA AAT AGA CAG ATC GCT
267 ► Thr Met Asp Glu Arg Asn Arg Gln Ile Ala

5366 GAG ATA GGT GCC TCA CTG ATT AAG CAT TGG
277 ► Glu Ile Gly Ala Ser Leu Ile Lys His Trp

5396 TAA CTGTCAGACCAAGTTTACTCATATATACTTTAGAT
287 ► • • •

5434 TGATTTACGCGCCCTGTAGCGGCGCATTAAGCGCGGCGG 

5473 GTGTGGTGGTTACGCGCAGCGTGACCGCTACACTTGCCA 

5512 GCGCCCTAGCGCCCGCTCCTTTCGCTTTCTTCCCTTCCT

5551 TTCTCGCCACGTTCGCCGGCTTTCCCCGTCAAGCTCTAA

5590 ATCGGGGGCTCCCTTTAGGGTTCCGATTTAGTGCTTTAC 

5629 GGCACCTCGACCCCAAAAAACTTGATTTGGGTGATGGTT 

5668 CACGTAGTGGGCCATCGCCCTGATAGACGGTTTTTCGCC 

5707 CTTTGACGTTGGAGTCCACGTTCTTTAATAGTGGACTCT 

5746 TGTTCCAAACTTGAACAACACTCAACCCTATCTCGGGCT 

5785 ATTCTTTTGATTTATAAGGGATTTTGCCGATTTCGGCCT

5824 ATTGGTTAAAAAATGAGCTGATTTAACAAAAATTTAACG 

5863 CGAATTTTAACAAAATATTAACGTTTACAATTTAAAAGG 

5902 ATCTAGGTGAAGATCCTTTTTGATAATCTCATGACCAAA

5941 ATCCCTTAACGTGAGTTTTCGTTCCACTGAGCGTCAGAC 

5980 CCCGTAGAAAAGATCAAAGGATCTTCTTGAGATCCTTTT 

6019 TTTCTGCGCGTAATCTGCTGCTTGCAAACAAAAAAACCA 

6058 CCGCTACCAGCGGTGGTTTGTTTGCCGGATCAAGAGCTA 

6097 CCAACTCTTTTTCCGAAGGTAACTGGCTTCAGCAGAGCG 
Figure 13b (cont'd)

6136 CAGATACCAAATACTGTCCTTCTAGTGTAGCCGTAGTTA

6175 GGCCACCACTTCAAGAACTCTGTAGCACCGCCTACATAC

6214 CTCGCTCTGCTAATCCTGTTACCAGTGGCTGCTGCCAGT

6253 GGCGATAAGTCGTGTCTTACCGGGTTGGACTCAAGACGA

6292 TAGTTACCGGATAAGGCGCAGCGGTCGGGCTGAACGGGG

6331 GGTTCGTGCACACAGCCCAGCTTGGAGCGAACGACCTAC

6370 ACCGAACTGAGATACCTACAGCGTGAGCTATGAGAAAGC

6409 GCCACGCTTCCCGAAGGGAGAAAGGCGGACAGGTATCCG

6448 GTAAGCGGCAGGGTCGGAACAGGAGAGCGCACGAGGGAG

6487 CTTCCAGGGGGAAACGCCTGGTATCTTTATAGTCCTGTC

6526 GGGTTTCGCCACCTCTGACTTGAGCGTCGATTTTTGTGA

6565 TGCTCGTCAGGGGGGCGGAGCCTATGGAAAAACGCCAGC

6604 AACGCGGCCTTTTTACGGTTCCTGGCCTTTTGCTGGCCT

6643 TTTGCTCACATGTTCTTTCCTGCGTTATCCCCTGATTCT

6682 GTGGATAACCGTATTACCGCCTTTGAGTGAGCTGATACC

6721 GCTCGCCGCAGCCGAACGACCGAGCGCAGCGAGTCAGTG

6760 AGCGAGGAAGCGGAAGAGCGCCTGATGCGGTATTTTCTC

6799 CTTACGCATCTGTGCGGTATTTCACACCGCATAGGGTCA

6838 TGGCTGCGCCCCGACACCCGCCAACACCCGCTGACGCGC

6877 CCTGACGGGCTTGTCTGCTCCCGGCATCCGCTTACAGAC

6916 AAGCTGTGACCGTCTCCGGGAGCTGCATGTGTCAGAGGT

6955 TTTCACCGTCATCACCGAAACGCGCGAGGCAGCAAGGAG
Figure 13b (cont'd)

6994 ATGGCGCCCAACAGTCCCCCGGCCACGGGGCCTGCCACC 
7033 ATACCCACGCCGAAACAAGCGCTCATGAGCCCGAAGTGG 
7072 CGAGCCCGATCTTCCCCATCGGTGATGTCGGCGATATAG 
7111 GCGCCAGCAACCGCACCTGTGGCGCCGGTGATGCCGGCC
7150 ACGATGCGTCCGGCGTAGAGGATCTGCTCATGTTTGACA 
7189 GCTTATC 
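
The residue rows printed under the codon rows in the figure follow the standard genetic code. The sketch below is an illustration only and is not part of the original disclosure. It assumes the standard code table and uses hypothetical names (`CODON_TABLE`, `translate`) to show how a codon row such as `ATG GAT ATT` maps to `Met Asp Ile`.

```python
from itertools import product

# Standard genetic code, with codons enumerated in T, C, A, G order.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}

# One-letter to three-letter residue names; "***" marks a stop codon.
THREE_LETTER = {
    "A": "Ala", "R": "Arg", "N": "Asn", "D": "Asp", "C": "Cys",
    "Q": "Gln", "E": "Glu", "G": "Gly", "H": "His", "I": "Ile",
    "L": "Leu", "K": "Lys", "M": "Met", "F": "Phe", "P": "Pro",
    "S": "Ser", "T": "Thr", "W": "Trp", "Y": "Tyr", "V": "Val",
    "*": "***", "X": "Xaa",
}


def translate(dna: str) -> str:
    """Translate a DNA string codon by codon into three-letter residues.

    Whitespace is ignored, trailing bases that do not fill a codon are
    dropped, and codons with unreadable characters are reported as Xaa.
    """
    dna = "".join(dna.split()).upper()
    usable = len(dna) - len(dna) % 3
    residues = []
    for i in range(0, usable, 3):
        one_letter = CODON_TABLE.get(dna[i:i + 3], "X")
        residues.append(THREE_LETTER[one_letter])
    return " ".join(residues)


if __name__ == "__main__":
    print(translate("ATG GAT ATT"))
    print(translate("ATT GAT TCA GAG GTA TAA"))
```

Running such a check against each codon row is one way to spot transcription errors, since a misread base usually changes the predicted residue.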
Figure 14b

NsiI

1 ATCGATGCATAATGTGCCTGTCAAATGGACGAAGCAGGG
40 ATTCTGCAAACCCTATGCTACTCCGTCAAGCCGTCAATT

79 GTCTGATTCGTTACCAA TTA TGA CAA CTT GAC
293 ◄ • • • Ser Leu Lys Val
Figure 14b (cont'd)



801 TIT 


GAP 


CAP 




r v TG 












58< Lys 


Va 1 


Va 1 


Gl y 


Gl n 


Gl y 


A rg 


1 1 e 


Thr 


Leu 




GAG 


AAT 






ill 




1 \— ^ 


CAG 




48< Asn 


Leu 


1 1 e 


Tyr 


Gl y 


Ly s 


Met 


Gl v 


Leu 


P ro 


861 TCG GTC GAT AAA AAA ATC GAG ATA ACC GTT
38 ◄ Arg Asp Ile Phe Phe Asp Leu Tyr Gly Asn


891 GGC CTC AAT CGG CGT TAA ACC CGC CAC CAG
28 ◄ Ala Glu Ile Pro Thr Leu Gly Ala Val Leu


921 ATG GGC ATT AAA CGA GTA TCC CGG CAG CAG
18 ◄ His Ala Asn Phe Ser Tyr Gly Pro Leu Leu


951 GGG ATC ATT TTG CGC TTC AGC CAT ACTTTTC
8 ◄ Pro Asp Asn Gln Ala Glu Ala Met







982 ATACTCCCGCCATTCAGAGAAGAAACCAATTGTCCATAT 



1021 TGCATCAGACATTGCCGTCACTGCGTCTTTTACTGGCTC 

1060 TTCTCGCTAACCAAACCGGTAACCCCGCTTATTAAAAGC 

1099 ATTCTGTAACAAAGCGGGACCAAAGCCATGACAAAAACG

1138 CGTAACAAAAGTGTCTATAATCACGGCAGAAAAGTCCAC

1177 ATTGATTATTTCCACGGCGTCACACTTTGCTATGCCATA

BamHI

1216 GCATTTTTATCCATAAGATTAGCGGATCCTACCTGACGC

1255 TTTTTATCGCAACTCTCTACTGTTTCTCCATACCCGTTT

NheI EcoRI

1294 TTTTGGGCTAGCAGGAGGAATTCACC ATG ACA CCG

1 ► Met Thr Pro

PstI

1329 GAC ATT ATC CTG CAG CGT ACC GGG ATC GAT 
Val Ile Ala Lys


1449 CCC CGC TCC GGA AAG AAG TGG CCT GAC ATG
44 ► Pro Arg Ser Gly Lys Lys Trp Pro Asp Met


1479 AAA ATG TCC TAC TTC CAC ACC CTG CTT GCT
54 ► Lys Met Ser Tyr Phe His Thr Leu Leu Ala


1509 GAG GTT TGC ACC GGT GTG GCT CCG GAA GTT
64 ► Glu Val Cys Thr Gly Val Ala Pro Glu Val


1539 AAC GCT AAA GCA CTG GCC TGG GGA AAA CAG
74 ► Asn Ala Lys Ala Leu Ala Trp Gly Lys Gln




















EcoRI 


1569 TAC GAG AAC GAC GCC AGA ACC CTG TTT GAA
84 ► Tyr Glu Asn Asp Ala Arg Thr Leu Phe Glu


1599 TTC ACT TCC GGC GTG AAT GTT ACT GAA TCC
94 ► Phe Thr Ser Gly Val Asn Val Thr Glu Ser


1629 CCG 


ATC 


ATC 


TAT 


CGC 




GAA 


AGT 


ATG 


CGT 


104 ► Pro 


1 1 e 


1 1 e 


Tyr 


A rg 


Asp 


Gl u 


Ser 


Met 


A rg 


1659 ACC 




TGC 


TCT 


CCC 


GAT 


GGT 


TTA 


TGC 


AGT 


114 ►Thr 


Al a 


Cys 


Ser 


P ro 


Asp 


Gl y 


Leu 


Cys 


Ser 


1689 GAC GGC AAC GGC CTT GAA CTG AAA TGC CCG
124 ► Asp Gly Asn Gly Leu Glu Leu Lys Cys Pro
Figure 14b (cont'd)



1719 TTT ACC TCC CGG GAT TTC ATG AAG TTC CGG
134 ► Phe Thr Ser Arg Asp Phe Met Lys Phe Arg


1749 CTC GGT GGT TTC GAG GCC ATA AAG TCA GCT
144 ► Leu Gly Gly Phe Glu Ala Ile Lys Ser Ala


1779 TAC ATG GCC CAG GTG CAG TAC AGC ATG TGG
154 ► Tyr Met Ala Gln Val Gln Tyr Ser Met Trp


1809 GTG ACG CGA AAA AAT GCC TGG TAC TTT GCC
164 ► Val Thr Arg Lys Asn Ala Trp Tyr Phe Ala


1839 AAC TAT GAC CCG CGT ATG AAG CGT GAA GGC
174 ► Asn Tyr Asp Pro Arg Met Lys Arg Glu Gly


1869 CTG CAT TAT GTC GTG ATT GAG CGG GAT GAA
184 ► Leu His Tyr Val Val Ile Glu Arg Asp Glu


1899 AAG TAC ATG GCG AGT TTT GAC GAG ATC GTG
194 ► Lys Tyr Met Ala Ser Phe Asp Glu Ile Val


1929 CCG 


GAG 




ATC 


GAA 


AAA 


ATG 


GAC 


GAG 


GCA 


204^ Pro 


Gl u 


Phe 


1 1 e 


Gl u 


Ly s 


Met 


Asp 


Gl u 


Al a 


1959 CTG GCT GAA ATT GGT TTT GTA TTT GGG GAG
214 ► Leu Ala Glu Ile Gly Phe Val Phe Gly Glu



KpnI

1989 CAA TGG CGA TAGATCCGGTACCCGAGCACGTGTTGA
224 ► Gln Trp Arg



2025 CAATTAATCATCGGCATAGTATATCGGCATAGTATAATA

2064 CGACAAGGTGAGGAACTAAACC ATG AGT ACT GCA

1 ► Met Ser Thr Ala

2098 CTC GCA ACG CTG GCT GGG AAG CTG GCT GAA
5 ► Leu Ala Thr Leu Ala Gly Lys Leu Ala Glu



Figure 14b (cont'd)















SalI








2128 CGT GTC GGC ATG GAT TCT GTC GAC CCA CAG
15 ► Arg Val Gly Met Asp Ser Val Asp Pro Gln


2158 GAA 


CTG 


ATC 


ACC 


ACT 


CTT 


CGC 


CAG 


ACG 




25 > Gl u 


Leu 


1 1 e 


Thr 


Thr 


Leu 


A rg 


Gl n 


Thr 


Al a 


2188 TTT AAA GGT GAT GCC AGC GAT GCG CAG TTC
35 ► Phe Lys Gly Asp Ala Ser Asp Ala Gln Phe


2218 ATC GCA TTA CTG ATC GTT GCC AAC CAG TAC
45 ► Ile Ala Leu Leu Ile Val Ala Asn Gln Tyr


2248 GGC CTT AAT CCG TGG ACG AAA GAA ATT TAC
55 ► Gly Leu Asn Pro Trp Thr Lys Glu Ile Tyr


2278 GCC TTT CCT GAT AAG CAG AAT GGC ATC GTT
65 ► Ala Phe Pro Asp Lys Gln Asn Gly Ile Val


2308 CCG GTG GTG GGC GTT GAT GGC TGG TCC CGC
75 ► Pro Val Val Gly Val Asp Gly Trp Ser Arg


2338 ATC ATC AAT GAA AAC CAG CAG TTT GAT GGC
85 ► Ile Ile Asn Glu Asn Gln Gln Phe Asp Gly


2368 ATG GAC TTT GAG CAG GAC AAT GAA TCC TGT
95 ► Met Asp Phe Glu Gln Asp Asn Glu Ser Cys


2398 ACA TGC CGG ATT TAC CGC AAG GAC CGT AAT
105 ► Thr Cys Arg Ile Tyr Arg Lys Asp Arg Asn


2428 CAT CCG ATC TGC GTT ACC GAA TGG ATG GAT
115 ► His Pro Ile Cys Val Thr Glu Trp Met Asp


2458 GAA TGC CGC CGC GAA CCA TTC AAA ACT CGC
125 ► Glu Cys Arg Arg Glu Pro Phe Lys Thr Arg


2488 GAA 


GGC 


AGA 


GAA 


ATC 


ACG 




CCG 


TGG 


CAG 


135> Gl u 


Gl y 


A rg 


Gl u 


1 1 e 


Thr 


Gl y 


P ro 


Trp 


Gl n 
Figure 14b (cont'd)

2518 TCG CAT CCC AAA CGG ATG TTA CGT CAT AAA
145 ► Ser His Pro Lys Arg Met Leu Arg His Lys

2548 GCC ATG ATT CAG TGT GCC CGT CTG GCC TTC
155 ► Ala Met Ile Gln Cys Ala Arg Leu Ala Phe

2578 GGA TTT GCT GGT ATC TAT GAC AAG GAT GAA
165 ► Gly Phe Ala Gly Ile Tyr Asp Lys Asp Glu

2608 GCC GAG CGC ATT GTC GAA AAT ACT GCA TAC
175 ► Ala Glu Arg Ile Val Glu Asn Thr Ala Tyr

PstI

2638 ACT GCA GAA CGT CAG CCG GAA CGC GAC ATC
185 ► Thr Ala Glu Arg Gln Pro Glu Arg Asp Ile

2668 ACT CCG GTT AAC GAT GAA ACC ATG CAG GAG
195 ► Thr Pro Val Asn Asp Glu Thr Met Gln Glu

2698 ATT AAC ACT CTG CTG ATC GCC CTG GAT AAA
205 ► Ile Asn Thr Leu Leu Ile Ala Leu Asp Lys

2728 ACA TGG GAT GAC GAC TTA TTG CCG CTC TGT
215 ► Thr Trp Asp Asp Asp Leu Leu Pro Leu Cys

2758 TCC CAG ATA TTT CGC CGC GAC ATT CGT GCA
225 ► Ser Gln Ile Phe Arg Arg Asp Ile Arg Ala

2788 TCG TCA GAA CTG ACA CAG GCC GAA GCA GTA
235 ► Ser Ser Glu Leu Thr Gln Ala Glu Ala Val

2818 AAA GCT CTT GGA TTC CTG AAA CAG AAA GCC
245 ► Lys Ala Leu Gly Phe Leu Lys Gln Lys Ala

BglII XhoI

2848 GCA GAG CAG AAG GTG GCA GCA TAGATCTC GAG
255 ► Ala Glu Gln Lys Val Ala Ala • • •
Figure 14b (cont'd)

HindIII

2880 AAGCTTCCTGCTGAACATCAAAGGCAAGAAAACATCTGT

2919 TGTCAAAGACAGCATCCTTGAACAAGGACAATTAACAGT

2958 TAACAAATAAAAACGCAAAAGAAAATGCCGATATCCTAT

2997 TGGCATTTTCTTTTATTTCTTATCAACATAAAGGTGAAT

XhoI

3036 CCCATACCTCGAGCTTCACGCTGCCGCAAGCACTCAGGG

3075 CGCAAGGGCTGCTAAAAGGAAGCGGAACACGTAGAAAGC

3114 CAGTCCGCAGAAACGGTGCTGACCCCGGATGAATGTCAG

3153 CTACTGGGCTATCTGGACAAGGGAAAACGCAAGCGCAAA

3192 GAGAAAGGAGGTAGCTTGCAGTGGGCTTACATGGCGATA

3231 GCTAGACTGGGCGGTTTTATGGACAGCAAGCGAACCGGA

PvuII

3270 ATTGCCAGCTGGGGCGCCCTCTGGTAAGGTTGGGAAGCC

3309 CTGCAAAGTAAACTGGATGGCTTTCTTGCCGCCAAGGAT

BglII

3348 CTGATGGCGCAGGGGATCAAGATCTGATCAAGAGACAGG

3387 ATGAGGATCGTTTCGC ATG GAT ATT AAT ACT

1 ► Met Asp Ile Asn Thr

3418 GAA ACT GAG ATC AAG CAA AAG CAT TCA CTA
6 ► Glu Thr Glu Ile Lys Gln Lys His Ser Leu

3448 ACC CCC TTT CCT GTT TTC CTA ATC AGC CCG
16 ► Thr Pro Phe Pro Val Phe Leu Ile Ser Pro

3478 GCA TTT CGC GGG CGA TAT TTT CAC AGC TAT
26 ► Ala Phe Arg Gly Arg Tyr Phe His Ser Tyr

3508 TTC AGG AGT TCA GCC ATG AAC GCT TAT TAC
36 ► Phe Arg Ser Ser Ala Met Asn Ala Tyr Tyr
Figure 14b (cont'd)

3538 ATT CAG GAT CGT CTT GAG GCT CAG AGC TGG
46 ► Ile Gln Asp Arg Leu Glu Ala Gln Ser Trp

3568 GCG CGT CAC TAC CAG CAG CTC GCC CGT GAA
56 ► Ala Arg His Tyr Gln Gln Leu Ala Arg Glu

3598 GAG AAA GAG GCA GAA CTG GCA GAC GAC ATG
66 ► Glu Lys Glu Ala Glu Leu Ala Asp Asp Met

3628 GAA AAA GGC CTG CCC CAG CAC CTG TTT GAA
76 ► Glu Lys Gly Leu Pro Gln His Leu Phe Glu

3658 TCG CTA TGC ATC GAT CAT TTG CAA CGC CAC
86 ► Ser Leu Cys Ile Asp His Leu Gln Arg His

3688 GGG GCC AGC AAA AAA TCC ATT ACC CGT GCG
96 ► Gly Ala Ser Lys Lys Ser Ile Thr Arg Ala

3718 TTT GAT GAC GAT GTT GAG TTT CAG GAG CGC
106 ► Phe Asp Asp Asp Val Glu Phe Gln Glu Arg

3748 ATG GCA GAA CAC ATC CGG TAC ATG GTT GAA
116 ► Met Ala Glu His Ile Arg Tyr Met Val Glu

3778 ACC ATT GCT CAC CAC CAG GTT GAT ATT GAT
126 ► Thr Ile Ala His His Gln Val Asp Ile Asp

HindIII

3808 TCA GAG GTA TAA AACGAGTAGA AGC TTG GCT
136 ► Ser Glu Val

3839 GTT TTG GCG GAT GAG AGA AGA TTT TCA GCC 

3869 TGA TACAGATTAAATCAGAACGCAGAAGCGGTCTGATA 

3907 AAACAGAATTTGCCTGGCGGCAGTAGCGCGGTGGTCCCA 

3946 CCTGACCCCATGCCGAACTCAGAAGTGAAACGCCGTAGC 

3985 GCCGATGGTAGTGTGGGGTCTCCCCATGCGAGAGTAGGG 
Figure 14b (cont'd)

4024 AACTGCCAGGCATCAAATAAAACGAAAGGCTCAGTCGAA

4063 AGACTGGGCCTTTCGTTTTATCTGTTGTTTGTCGGTGAA

4102 CGCTCTCCTGAGTAGGACAAATCCGCCGGGAGCGGATTT

4141 GAACGTTGCGAAGCAACGGCCCGGAGGGTGGCGGGCAGG

4180 ACGCCCGCCATAAACTGCCAGGCATCAAATTAAGCAGAA

4219 GGCCATCCTGACGGATGGCCTTTTTGGGTTTCTACAAAC

4258 TCTTTTGTTTATTTTTCTAAATACATTCAAATATGTATC

4297 CGCTCATGAGACAATAACCCTGATAAATGCTTCAATAAT

4336 ATTGAAAAAGGAAGAGT ATG AGT ATT CAA CAT

1 ► Met Ser Ile Gln His



4368 TTC CGT GTC GCC CTT ATT CCC TTT TTT GCG
6 ► Phe Arg Val Ala Leu Ile Pro Phe Phe Ala


4398 GCA TTT TGC CTT CCT GTT TTT GCT CAC CCA
16 ► Ala Phe Cys Leu Pro Val Phe Ala His Pro


4428 GAA ACG CTG GTG AAA GTA AAA GAT GCT GAA
26 ► Glu Thr Leu Val Lys Val Lys Asp Ala Glu


4458 GAT CAG TTG GGT GCA CGA GTG GGT TAC ATC
36 ► Asp Gln Leu Gly Ala Arg Val Gly Tyr Ile


4488 GAA CTG GAT CTC AAC AGC GGT AAG ATC CTT
46 ► Glu Leu Asp Leu Asn Ser Gly Lys Ile Leu


4518 GAG AGT TTT CGC CCC GAA GAA CGT TTT CCA
56 ► Glu Ser Phe Arg Pro Glu Glu Arg Phe Pro


4548 ATG ATG AGC ACT TTT AAA GTT CTG CTA TGT
66 ► Met Met Ser Thr Phe Lys Val Leu Leu Cys
Figure 14b (cont'd)



4578 GGC GCG GTA TTA TCC CGT GTT GAC GCC GGG
76 ► Gly Ala Val Leu Ser Arg Val Asp Ala Gly


4608 CAA GAG CAA CTC GGT CGC CGC ATA CAC TAT
86 ► Gln Glu Gln Leu Gly Arg Arg Ile His Tyr














Scai 






40 JO X v_ J- 


far 


A AT 


par' 


MM p 




gag 


TAC 


TP A 


pp A 


96^ Ser 


GI n 


Asn 


Asp 


Leu 


Val 


GI u 


Tyr 


Ser 


P ro 




Ar'A 




A Af 1 




LI i 


ACG 


GAT 


PPP 


ATP 


106 ► Val 


Thr 


GI u 


Lv s 


Hi s 


Leu 


Thr 


Asp 


GI y 


Me t 


f± O -7 O -riv — rt. 




Af^ A 




TTA 
X 1A 




AGT 


GCT 


ppp 


A TA 


116>Thr 


Val 


A rq 


GI u 


Leu 


Cys 


Ser 


Al a 


Al a 


1 1 e 


A79P Apr 


atp 


APT 


PAT 




APT 


GCG 


GCC 


A AP 


TTA 


126^Thr 


Met 


Ser 


Asp 


Asn 


Thr 


Al a 


Al a 


Asn 


Leu 


*s / JO Lll 


\- IVjj 


AHA 




7\rpp 


Gp A 


GGA 


CCG 


a ap 


PA p 


136> Leu 


Leu 


Thr 


Thr 


1 1 e 


GI y 


GI y 


P ro 


Ly s 


GI u 


47qq r^TZv 






J 1 V I V I 1 

XXX 


TTP 


PAP 


AAC 


ATG 




PAT 


146^ Leu 


Thr 


Al a 


Phe 


Leu 


Hi s 


Asn 


Met 


GI y 


Asp 


AQ1 ft fAT 1 




-TVw X 


prip 




PAT 


CGT 


TGG 


PA A 




15S> Hi s 


Val 


Thr- 


A rq 


Leu 


Asp 


Arg 


Trp 


GI U 


Pro 




PTTZ 

xVj 


AAT 


pa A 


ppp 


ATA 


CCA 


AAC 


PAP 


PAP 


166^ GI u 


Leu 


Asn 


GI u 


Al a 


1 1 e 


P ro 


Asn 


Asp 


GI u 


4878 CGT 


GAG 


ACG 


AGG 


ATG 


PPT 


GTA 


GCA 


ATG 


^ — * /*— * -*\ 


176 ► A rn 


A cn 


Th r 

111) 


Th r 

1 1 1 1 


1 VIC I 


P rn 


Val 


Al a 


Me t 


Al a 


4908 ACA ACG TTG CGC AAA CTA TTA ACT GGC GAA
186 ► Thr Thr Leu Arg Lys Leu Leu Thr Gly Glu


4938 CTA CTT ACT CTA GCT TCC CGG CAA CAA TTA
196 ► Leu Leu Thr Leu Ala Ser Arg Gln Gln Leu
Figure 14b (cont'd)





GAG 


IGG 


"A TV" 1 


GAG 




GAT 


^ "> 


GTT 


GCA 


206^ 1 1 e 


A S D 


T ro 


Me t 


Gl u 


A i a 


A <^ n 


I \/ ^ 

l— y o 


Va 1 


A 1 a 


4 y y O GGuri. 


rv^ a 

GGJ-l 


v_ I I 


G 1G 


GG*_ 




GGG 


LI i 


LLb 


GL I 


216^ Gl y 


P ro 


Leu 


Leu 


A ra 


Se r 


Al a 


Leu 


P ro 


Al a 






ill 




1 




AAA 


Is L 






22 6 ► Gl y 


Trp 


Phe 


1 1 e 


A 1 a 


Asp 


Lv s 


Ser 


Gl y 


Al a 


jUjo Kj\J X 


GriG 


GG 1 




1 \_ I 


GGG 






A ' 1 * M 




236> Gl y 


Gl u 


A rq 


Gl y 


Ser 


A ra 


Gl y 


1 1 e 


1 1 e 


Al a 


jUoo GG-rt. 










GG1 


HAG 






Lbl 


246^ Al a 


Leu 


Gl v 


P ro 


Asp 


Gl v 


Lv s 


P ro 


Ser 


A ra 


5118 ATC GTA GTT ATC TAC ACG ACG GGG AGT CAG
256 ► Ile Val Val Ile Tyr Thr Thr Gly Ser Gln


5148 GCA ACT ATG GAT GAA CGA AAT AGA CAG ATC
266 ► Ala Thr Met Asp Glu Arg Asn Arg Gln Ile


5178 GCT GAG ATA GGT GCC TCA CTG ATT AAG CAT
276 ► Ala Glu Ile Gly Ala Ser Leu Ile Lys His



5208 TGG TAA CTGTCAGACCAAGTTTACTCATATATACTTT
286 ► Trp • • •

5245 AGATTGATTTACGCGCCCTGTAGCGGCGCATTAAGCGCG

5284 GCGGGTGTGGTGGTTACGCGCAGCGTGACCGCTACACTT 

5323 GCCAGCGCCCTAGCGCCCGCTCCTTTCGCTTTCTTCCCT 

5362 TCCTTTCTCGCCACGTTCGCCGGCTTTCCCCGTCAAGCT 

5401 CTAAATCGGGGGCTCCCTTTAGGGTTCCGATTTAGTGCT 

5440 TTACGGCACCTCGACCCCAAAAAACTTGATTTGGGTGAT

5479 GGTTCACGTAGTGGGCCATCGCCCTGATAGACGGTTTTT
Figure 14b (cont'd)

5518 CGCCCTTTGACGTTGGAGTCCACGTTCTTTAATAGTGGA 
5557 CTCTTGTTCCAAACTTGAACAACACTCAACCCTATCTCG 
5596 GGCTATTCTTTTGATTTATAAGGGATTTTGCCGATTTCG 
5635 GCCTATTGGTTAAAAAATGAGCTGATTTAACAAAAATTT 
5674 AACGCGAATTTTAACAAAATATTAACGTTTACAATTTAA 
5713 AAGGATCTAGGTGAAGATCCTTTTTGATAATCTCATGAC 
5752 CAAAATCCCTTAACGTGAGTTTTCGTTCCACTGAGCGTC 
5791 AGACCCCGTAGAAAAGATCAAAGGATCTTCTTGAGATCC 
5830 TTTTTTTCTGCGCGTAATCTGCTGCTTGCAAACAAAAAA
5869 ACCACCGCTACCAGCGGTGGTTTGTTTGCCGGATCAAGA
5908 GCTACCAACTCTTTTTCCGAAGGTAACTGGCTTCAGCAG 
5947 AGCGCAGATACCAAATACTGTCCTTCTAGTGTAGCCGTA 
5986 GTTAGGCCACCACTTCAAGAACTCTGTAGCACCGCCTAC 
6025 ATACCTCGCTCTGCTAATCCTGTTACCAGTGGCTGCTGC
6064 CAGTGGCGATAAGTCGTGTCTTACCGGGTTGGACTCAAG 
6103 ACGATAGTTACCGGATAAGGCGCAGCGGTCGGGCTGAAC 
6142 GGGGGGTTCGTGCACACAGCCCAGCTTGGAGCGAACGAC
6181 CTACACCGAACTGAGATACCTACAGCGTGAGCTATGAGA 
6220 AAGCGCCACGCTTCCCGAAGGGAGAAAGGCGGACAGGTA 
6259 TCCGGTAAGCGGCAGGGTCGGAACAGGAGAGCGCACGAG 
6298 GGAGCTTCCAGGGGGAAACGCCTGGTATCTTTATAGTCC
6337 TGTCGGGTTTCGCCACCTCTGACTTGAGCGTCGATTTTT
Figure 14b (cont'd)

6376 GTGATGCTCGTCAGGGGGGCGGAGCCTATGGAAAAACGC 

6415 CAGCAACGCGGCCTTTTTACGGTTCCTGGCCTTTTGCTG 

6454 GCCTTTTGCTCACATGTTCTTTCCTGCGTTATCCCCTGA

6493 TTCTGTGGATAACCGTATTACCGCCTTTGAGTGAGCTGA 

6532 TACCGCTCGCCGCAGCGGAACGACCGAGCGCAGCGAGTC 

6571 AGTGAGCGAGGAAGCGGAAGAGCGCCTGATGGGGTATTT 

6610 TCTCCTTACGCATCTGTGCGGTATTTCACACCGCATAGG

6649 GTCATGGCTGCGCCCCGACACCCGCCAACACCCGCTGAC 

6688 GCGCCCTGACGGGCTTGTCTGCTCCCGGCATCCGCTTAC 

6727 AGACAAGCTGTGACCGTCTCCGGGAGCTGCATGTGTCAG 

6766 AGGTTTTCACCGTCATCACCGAAACGCGCGAGGCAGCAA 

6805 GGAGATGGCGCCCAACAGTCCCCCGGCCACGGGGCCTGC 

6844 CACCATACCCACGCCGAAACAAGCGCTCATGAGCCCGAA

6883 GTGGCGAGCCCGATCTTCCCCATCGGTGATGTCGGCGAT 

6922 ATAGGCGCCAGCAACCGCACCTGTGGCGCCGGTGATGCC 

6961 GGCCACGATGCGTCCGGCGTAGAGGATCTGCTCATGTTT 

7000 GACAGCTTATC 



SEQUENCE LISTING



(1) GENERAL INFORMATION: 



(EMBL)



(EPO) 



(A) NAME: European Molecular Biology Laboratory

(B) STREET: Meyerhofstrasse 1
(C) CITY: Heidelberg
(E) COUNTRY: DE

(F) POSTAL CODE (ZIP): D-69117

(ii) TITLE OF INVENTION: Novel DNA Cloning Method

(iii) NUMBER OF SEQUENCES: 14

(iv) COMPUTER READABLE FORM:

(A) MEDIUM TYPE: Floppy disk
(B) COMPUTER: IBM PC compatible

(C) OPERATING SYSTEM: PC-DOS/MS-DOS

(D) SOFTWARE: PatentIn Release #1.0, Version #1.30



(vi) PRIOR APPLICATION DATA:

(A) APPLICATION NUMBER: EP 97121562.2
(B) FILING DATE: 05-DEC-1997

(vi) PRIOR APPLICATION DATA:

(A) APPLICATION NUMBER: EP 98118756.0

(B) FILING DATE: 05-OCT-1998



(2) INFORMATION FOR SEQ ID NO: 1:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 6150 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: both

(D) TOPOLOGY: both



(vii) IMMEDIATE SOURCE:

(B) CLONE: pBAD24-recET

(ix) FEATURE:

(A) NAME/KEY: misc_feature

(B) LOCATION: complement (96..974)

(D) OTHER INFORMATION: /product= "araC"

(ix) FEATURE:

(A) NAME/KEY: misc_feature

(B) LOCATION: 1320..2162

(D) OTHER INFORMATION: /product= "t-recE"



(ix) FEATURE:

(A) NAME/KEY: misc_feature
(B) LOCATION: 2155..2972

(D) OTHER INFORMATION: /product= "recT"

(ix) FEATURE:

(A) NAME/KEY: misc_feature
(B) LOCATION: 3493..4353

(D) OTHER INFORMATION: /product= "bla"
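
The feature locations in this listing use 1-based, inclusive coordinates, and a location wrapped in `complement` refers to the opposite strand. The sketch below is an illustration only and is not part of the original listing. It assumes that convention and uses hypothetical helper names (`parse_location`, `extract_feature`) to show how a feature's bases could be pulled out of the full sequence.

```python
import re

# Translation table for complementing DNA bases (upper and lower case).
_COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA string."""
    return seq.translate(_COMPLEMENT)[::-1]


def parse_location(location: str):
    """Parse 'START..END' or 'complement (START..END)'.

    Returns (start, end, is_complement) with 1-based inclusive coordinates.
    """
    text = location.replace(" ", "")
    is_complement = text.startswith("complement(") and text.endswith(")")
    if is_complement:
        text = text[len("complement("):-1]
    match = re.fullmatch(r"(\d+)\.\.(\d+)", text)
    if match is None:
        raise ValueError(f"unsupported location: {location!r}")
    return int(match.group(1)), int(match.group(2)), is_complement


def extract_feature(seq: str, location: str) -> str:
    """Return the bases of a feature, reverse-complemented if required."""
    start, end, is_complement = parse_location(location)
    # Convert 1-based inclusive coordinates to a Python slice.
    fragment = seq[start - 1:end]
    return reverse_complement(fragment) if is_complement else fragment


if __name__ == "__main__":
    toy = "AACCGGTT"  # toy sequence, not taken from the listing
    print(extract_feature(toy, "2..4"))
    print(extract_feature(toy, "complement (2..4)"))
```

Under this convention the location `complement (96..974)` spans 974 - 96 + 1 = 879 bases, which is 293 codons including the stop codon.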

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 1:

ATCGATGCAT AATGTGCCTG TCAAATGGAC GAAGCAGGGA TTCTGCAAAC CCTATGCTAC 60

TCCGTCAAGC CGTCAATTGT CTGATTCGTT ACCAATTATG ACAACTTGAC GGCTACATCA 120

TTCACTTTTT CTTCACAACC GGCACGGAAC TCGCTCGGGC TGGCCCCGGT GCATTTTTTA 180

AATACCCGCG AGAAATAGAG TTGATCGTCA AAACCAACAT TGCGACCGAC GGTGGCGATA 240

GGCATCCGGG TGGTGCTCAA AAGCAGCTTC GCCTGGCTGA TACGTTGGTC CTCGCGGCAG 300

CTTAAGACGC TAATCCCTAA CTGCTGGCGG AAAAGATGTG ACAGACGCGA CGGCGACAAG 360

CAAACATGCT GTGCGACGCT GGCGATATCA AAATTGCTGT CTGCCAGGTG ATCGCTGATG 420

TACTGACAAG CCTCGCGTAC CCGATTATCC ATCGGTGGAT GGAGCGACTC GTTAATCGCT 480

TCCATGCGCC GCAGTAACAA TTGCTCAAGC AGATTTATCG CCAGCAGCTC CGAATAGCGC 540

CCTTCCCCTT GCCCGGCGTT AATGATTTGC CCAAACAGGT CGCTGAAATG CGGCTGGTGC 600

GCTTCATCCG GGCGAAAGAA CCCCGTATTG GCAAATATTG ACGGCCAGTT AAGCCATTCA 660

TGCCAGTAGG CGCGCGGACG AAAGTAAACC CACTGGTGAT ACCATTCGCG AGCCZCCGGA 720

TGACGACCGT AGTGATGAAT CTCTCCTGGC GGGAACAGCA AAATATCACC CGGTCGGCAA 780

ACAAATTCTC GTCCCTGATT TTTCACCACC CCCTG^CCGC GAATGGTGAG ATTGAGAATA 840

TAACCTTTCA TTCCCAGCGG TCGGTCGATA AAAAAATCGA GATAACCGTT GGCCTCAATC 900

GGCGTTAAAC CCGCCACCAG ATGGGCATTA AACGAGTATC CCGGCAGCAG GGGATCATTT 960

TGCGCTTCAG CCATACTTTT CATACTCCCG CCATTCAGAG AAGAAACCAA TTGTCCATAT 1020

TGCATCAGAC ATTGCCGTCA CTGCGTCTTT TACTGGCTCT TCTCGCTAAC CAAACCGGTA 1080

ACCCCGCTTA TTAAAAGCAT TCTGTAACAA AGCGGGACCA AAGCCATGAC AAAAACGCGT 1140

AACAAAAGTG TCTATAATCA CGGCAGAAAA GTCCACATTG ATTATTTGCA CGGCGTCACA 1200

CTTTGCTATG CCATAGCATT TTTATCCATA AGATTAGCGG ATCCTACCTG ACGCTTTTTA 1260

TCGCAACTCT CTACTGTTTC TCCATACCCG TTTTTTTGGG CTAGCAGGAG GAATTCACCA 1320

TGGATCCCGT AATCGTAGAA GACATAGAGC CAGGTATTTA TTACGGAATT TCGAATGAGA 1380

ATTACCACGC GGGTCCCGGT ATCAGTA^GT CTCAGCTCGA TGACATTGCT GATACTCCGG 1440

CACTATATTT GTGGCGTAAA AATGCCCCCG TGGACACCAC AAAGACAAAA ACGCTCGATT 1500

TAGGAACTGC TTTCCACTGC CGGGTACTTG AACCGGAAGA ATTCAGTAAC CGCTTTATCG 1560



WO 99/29837 



PCT/EP98/07945



TAGCACCTGA ATTTAACCGC CGTACAAACG CCGGAAAAGA AGAAGAGAAA GCGTTTCTGA 1620

TGGAATGCGC AAGCACAGGA AAAACGGTTA TCACTGCGGA AGAAGGCCGG AAAATTGAAC 1680

TCATGTATCA AAGCGTTATG GCTTTGCCGC TGGGGCAATG GCTTGTTGAA AGCGCCGGAC 1740

ACGCTGAATC ATCAATTTAC TGGGAAGATC CTGAAACAGG AATTTTGTGT CGGTGCCGTC 1800

CGGACAAAAT TATCCCTGAA TTTCACTGGA TCATGGACGT GAAAACTACG GCGGATATTC 1860

AACGATTCAA AACCGCTTAT TACGACTACC GCTATCACGT TCAGGATGCA TTCTACAGTC 1920

ACGGTTATGA AGCACAGTTT GGAGTGCAGC CAACTTTCGT TTTTCTGGTT GCCAGCACAA 1980

CTATTGAATG CGGACGTTAT CCGGTTGAAA TTTTCATGAT GGGCGAAGAA GCAAAACTGG 2040

CAGGTCAACA GGAATATCAC CGCAATCTGC GAACCCTGTC TGACTGCCTG AATACCGATG 2100

AATGGCCAGC TATTAAGACA TTATCACTGC CCCGCTGGGC TAAGGAATAT GCAAATGACT 2160

AAGCAACCAC CAATCGCAAA AGCCGATCTG CAAAAAACTC AGGGAAACCG TGCACCAGCA 2220

GCAGTTAAAA ATAGCGACGT GATTAGTTTT ATTAACCAGC CATCAATGAA AGAGCAACTG 2280

GCAGCAGCTC TTCCACGCCA TATGACGGCT GAACGTATGA TCCGTATCGC CACCACAGAA 2340

ATTCGTAAAG TTCCGGCGTT AGGAAACTGT GACACTATGA GTTTTGTCAG TGCGATCGTA 2400

CAGTGTTCAC AGCTCGGACT TGAGCCAGGT AGCGCCCTCG GTCATGCATA TTTACTGCCT 2460

TTTGGTAATA AAAACGAAAA GAGCGGTAAA AAGAACGTTC AGCTAATCAT TGGCTATCGC 2520

GGCATGATTG ATCTGGCTCG CCGTTCTGGT CAAATCGCCA GCCTGTCAGC CCGTGTTGTC 2580

CGTGAAGGTG ACGAGTTTAG CTTCGAATTT GGCCTTGATG AAAAGTTAAT ACACCGCCCG 2640

GGAGAAAACG AAGATGCCCC GGTTACCCAC GTCTATGCTG TCGCAAGACT GAAAGACGGA 2700

GGTACTCAGT TTGAAGTTAT GACGCGCAAA CAGATTGAGC TGGTGCGCAG CCTGAGTAAA 2760

GCTGGTAATA ACGGGCCGTG GGTAACTCAC TGGGAAGAAA TGGCAAAGAA AACGGCTATT 2820

CGTCGCCTGT TCAAATATTT GCCCGTATCA ATTGAGATCC AGCGTGCAGT ATCAATGGAT 2880

GAAAAGGAAC CACTGACAAT CGATCCTGCA GATTCCTCTG TATTAACCGG GGAATACAGT 2940

GTAATCGATA ATTCAGAGGA ATAGATCTAA GCTTGGCTGT TTTGGCGGAT GAGAGAAGAT 3000

TTTCAGCCTG ATACAGATTA AATCAGAACG CAGAAGCGGT CTGATAAAAC AGAATTTGCC 3060

TGGCGGCAGT AGCGCGGTGG TCCCACCTGA CCCCATGCCG AACTCAGAAG TGAAACGCCG 3120

TAGCGCCGAT GGTAGTGTGG GGTCTCCCCA TGCGAGAGTA GGGAACTGCC AGGCATCAAA 3180

TAAAACGAAA GGCTCAGTCG AAAGACTGGG CCTTTCGTTT TATCTGTTGT TTGTCGGTGA 3240

ACGCTCTCCT GAGTAGGACA AATCCGCCGG GAGCGGATTT GAACGTTGCG AAGCAACGGC 3300

CCGGAGGGTG GCGGGCAGGA CGCCCGCCAT AAACTGCCAG GCATCAAATT AAGCAGAAGG 3360

CCATCCTGAC GGATGGCCTT TTTGCGTTTC TACAAACTCT TTTGTTTATT TTTCTAAATA 3420

CATTCAAATA TGTATCCGCT CATGAGACAA TAACCCTGAT AAATGCTTCA ATAATATTGA 3480

AAAAGGAAGA GTATGAGTAT TCAACATTTC CGTGTCGCCC TTATTCCCTT TTTTGCGGCA 3540

TTTTGCCTTC CTGTTTTTGC TCACCCAGAA ACGCTGGTGA AAGTAAAAGA TGCTGAAGAT 3600
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CAGTTGGG7G CACGAGTGGG TTACATCGAA CTGGATCTCA ACAGCGGTAA GATCCTTGAG .3660 
AGTTTTCGCC CCGAAGAACG TTTTCCAATG ATGAGCACTT TTAAAGTTC7 GCTATGTGGC 3 72C 
GCGGT AT TAT CCCGTGTTGA CGCCGGGCAA GAGCAACTCG GTCGCCGCAT ACACTATTCT ' 3 7 80" 
C AG AATG AC T 7GGTTGAGTA CTCACCAGTC ACAGAAAAGC ATCTTACGGA TGGCATGACA 3 84 0 

GTAAGAGAAT TATGCAGTGC TGCCATAACC ATGAGTGATA ACACTGCGGC CAACTTACTT 3900
CTGACAACGA TCGGAGGACC GAAGGAGCTA ACCGCTTTTT TGCACAACAT GGGGGATCAT 3960
GTAACTCGCC TTGATCGTTG GGAACCGGAG CTGAATGAAG CCATACCAAA CGACGAGCGT 4020
GACACCACGA TGCCTGTAGC AATGGCAACA ACGTTGCGCA AACTATTAAC TGGCGAACTA 4080
CTTACTCTAG CTTCCCGGCA ACAATTAATA GACTGGATGG AGGCGGATAA AGTTGCAGGA 4140
CCACTTCTGC GCTCGGCCCT TCCGGCTGGC TGGTTTATTG CTGATAAATC TGGAGCCGGT 4200
GAGCGTGGGT CTCGCGGTAT CATTGCAGCA CTGGGGCCAG ATGGTAAGCC CTCCCGTATC 4260
GTAGTTATCT ACACGACGGG GAGTCAGGCA ACTATGGATG AACGAAATAG ACAGATCGCT 4320
GAGATAGGTG CCTCACTGAT TAAGCATTGG TAACTGTCAG ACCAAGTTTA CTCATATATA 4380
CTTTAGATTG ATTTACGCGC CCTGTAGCGG CGCATTAAGC GCGGCGGGTG TGGTGGTTAC 4440
GCGCAGCGTG ACCGCTACAC TTGCCAGCGC CCTACCGCCC GCTCCTTTCG CTTTCTTCCC 4500
TTCCTTTCTC GCCACGTTCG CCGGCTTTCC CCGTCAAGCT CTAAATCGGG GGCTCCCTTT 4560
AGGGTTCCGA TTTAGTGCTT TACGGCACCT CGACCCCAAA AAACTTGATT TGGGTGATGG 4620
TTCACGTAGT GGGCCATCGC CCTGATAGAC GGTTTTTCGC CCTTTGACGT TGGAGTCCAC 4680
GTTCTTTAAT AGTGGACTCT TGTTCCAAAC TTGAACAACA CTCAACCCTA TCTCGGGCTA 4740
TTCTTTTGAT TTATAAGGGA TTTTGCCGAT TTCGGCCTAT TGGTTAAAAA ATGAGCTGAT 4800
TTAACAAAAA TTTAACGCGA ATTTTAACAA AATATTAACG TTTACAATTT AAAAGGATCT 4860
AGGTGAAGAT CCTTTTTGAT AATCTCATGA CCAAAATCCC TTAACGTGAG TTTTCGTTCC 4920
ACTGAGCGTC AGACCCCGTA GAAAAGATCA AAGGATCTTC TTGAGATCCT TTTTTTCTGC 4980
GCGTAATCTG CTGCTTGCAA ACAAAAAAAC CACCGCTACC AGCGGTGGTT TGTTTGCCGG 5040
ATCAAGAGCT ACCAACTCTT TTTCCGAAGG TAACTGGCTT CAGCAGAGCG CAGATACCAA 5100
ATACTGTCCT TCTAGTGTAG CCGTAGTTAG GCCACCACTT CAAGAACTCT GTAGCACCGC 5160
CTACATACCT CGCTCTGCTA ATCCTGTTAC CAGTGGCTGC TGCCAGTGGC GATAAGTCGT 5220
GTCTTACCGG GTTGGACTCA AGACGATAGT TACCGGATAA GGCGCAGCGG TCGGGCTGAA 5280
CGGGGGGTTC GTGCACACAG CCCAGCTTGG AGCGAACGAC CTACACCGAA CTGAGATACC 5340
TACAGCGTGA GCTATGAGAA AGCGCCACGC TTCCCGAAGG GAGAAAGGCG GACAGGTATC 5400
CGGTAAGCGG CAGGGTCGGA ACAGGAGAGC GCACGAGGGA GCTTCCAGGG GGAAACGCCT 5460
GGTATCTTTA TAGTCCTGTC GGGTTTCGCC ACCTCTGACT TGAGCGTCGA TTTTTGTGAT 5520
GCTCGTGAGG GGGGCGGAGC CTATGGAAAA ACGCCAGCAA CGCGGCCTTT TTACGGTTCC 5580
TGGCCTTTTG CTGGCCTTTT GCTCACATGT TCTTTCCTGC GTTATCCCCT GATTCTGTGG 5640
ATAACCGTAT TACCGCCTTT GAGTGAGCTG ATACCGCTCG CCGCAGCCGA ACGACCGAGC 5700
GCAGCGAGTC AGTGAGCGAG GAAGCGGAAG AGCGCCTGAT GCGGTATTTT CTCCTTACGC 5760
ATCTGTGCGG TATTTCACAC CGCATAGGGT CATGGCTGCG CCCCGACACC CGCCAACACC 5820
CGCTGACGCG CCCTGACGGG CTTGTCTGCT CCCGGCATCC GCTTACAGAC AAGCTGTGAC 5880
CGTCTCCGGG AGCTGCATGT GTCAGAGGTT TTCACCGTCA TCACCGAAAC GCGCGAGGCA 5940
GCAAGGAGAT GGCGCCCAAC AGTCCCCCGG CCACGGGGCC TGCCACCATA CCCACGCCGA 6000
AACAAGCGCT CATGAGCCCG AAGTGGCGAG CCCGATCTTC CCCATCGGTG ATGTCGGCGA 6060
TATAGGCGCC AGCAACCGCA CCTGTGGCGC CGGTGATGCC GGCCACGATG CGTCCGGCGT 6120
AGAGGATCTG CTCATGTTTG ACAGCTTATC 6150
(2) INFORMATION FOR SEQ ID NO: 2:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 843 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: both

(D) TOPOLOGY: both

(vii) IMMEDIATE SOURCE:
(B) CLONE: t-recE

(ix) FEATURE:

(A) NAME/KEY: CDS
(B) LOCATION: 1..843

(D) OTHER INFORMATION: /product= "t-recE"
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 2:

ATG GAT CCC GTA ATC GTA GAA GAC ATA GAG CCA GGT ATT TAT TAC GGA 48
Met Asp Pro Val Ile Val Glu Asp Ile Glu Pro Gly Ile Tyr Tyr Gly
1 5 10 15

ATT TCG AAT GAG AAT TAC CAC GCG GGT CCC GGT ATC AGT AAG TCT CAG 96
Ile Ser Asn Glu Asn Tyr His Ala Gly Pro Gly Ile Ser Lys Ser Gln
20 25 30

CTC GAT GAC ATT GCT GAT ACT CCG GCA CTA TAT TTG TGG CGT AAA AAT 144
Leu Asp Asp Ile Ala Asp Thr Pro Ala Leu Tyr Leu Trp Arg Lys Asn
35 40 45

GCC CCC GTG GAC ACC ACA AAG ACA AAA ACG CTC GAT TTA GGA ACT GCT 192
Ala Pro Val Asp Thr Thr Lys Thr Lys Thr Leu Asp Leu Gly Thr Ala
50 55 60

TTC CAC TGC CGG GTA CTT GAA CCG GAA GAA TTC AGT AAC CGC TTT ATC 240
Phe His Cys Arg Val Leu Glu Pro Glu Glu Phe Ser Asn Arg Phe Ile
65 70 75 80

GTA GCA CCT GAA TTT AAC CGC CGT ACA AAC GCC GGA AAA GAA GAA GAG 288
Val Ala Pro Glu Phe Asn Arg Arg Thr Asn Ala Gly Lys Glu Glu Glu
85 90 95

AAA GCG TTT CTG ATG GAA TGC GCA AGC ACA GGA AAA ACG GTT ATC ACT 336
Lys Ala Phe Leu Met Glu Cys Ala Ser Thr Gly Lys Thr Val Ile Thr
100 105 110

GCG GAA GAA GGC CGG AAA ATT GAA CTC ATG TAT CAA AGC GTT ATG GCT 384
Ala Glu Glu Gly Arg Lys Ile Glu Leu Met Tyr Gln Ser Val Met Ala
115 120 125

TTG CCG CTG GGG CAA TGG CTT GTT GAA AGC GCC GGA CAC GCT GAA TCA 432
Leu Pro Leu Gly Gln Trp Leu Val Glu Ser Ala Gly His Ala Glu Ser
130 135 140

TCA ATT TAC TGG GAA GAT CCT GAA ACA GGA ATT TTG TGT CGG TGC CGT 480
Ser Ile Tyr Trp Glu Asp Pro Glu Thr Gly Ile Leu Cys Arg Cys Arg
145 150 155 160

CCG GAC AAA ATT ATC CCT GAA TTT CAC TGG ATC ATG GAC GTG AAA ACT 528
Pro Asp Lys Ile Ile Pro Glu Phe His Trp Ile Met Asp Val Lys Thr
165 170 175

ACG GCG GAT ATT CAA CGA TTC AAA ACC GCT TAT TAC GAC TAC CGC TAT 576
Thr Ala Asp Ile Gln Arg Phe Lys Thr Ala Tyr Tyr Asp Tyr Arg Tyr
180 185 190

CAC GTT CAG GAT GCA TTC TAC AGT GAC GGT TAT GAA GCA CAG TTT GGA 624
His Val Gln Asp Ala Phe Tyr Ser Asp Gly Tyr Glu Ala Gln Phe Gly
195 200 205

GTG CAG CCA ACT TTC GTT TTT CTG GTT GCC AGC ACA ACT ATT GAA TGC 672
Val Gln Pro Thr Phe Val Phe Leu Val Ala Ser Thr Thr Ile Glu Cys
210 215 220

GGA CGT TAT CCG GTT GAA ATT TTC ATG ATG GGC GAA GAA GCA AAA CTG 720
Gly Arg Tyr Pro Val Glu Ile Phe Met Met Gly Glu Glu Ala Lys Leu
225 230 235 240

GCA GGT CAA CAG GAA TAT CAC CGC AAT CTG CGA ACC CTG TCT GAC TGC 768
Ala Gly Gln Gln Glu Tyr His Arg Asn Leu Arg Thr Leu Ser Asp Cys
245 250 255

CTG AAT ACC GAT GAA TGG CCA GCT ATT AAG ACA TTA TCA CTG CCC CGC 816
Leu Asn Thr Asp Glu Trp Pro Ala Ile Lys Thr Leu Ser Leu Pro Arg
260 265 270

TGG GCT AAG GAA TAT GCA AAT GAC TAA 843
Trp Ala Lys Glu Tyr Ala Asn Asp *
275 280
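As a quick consistency check on the listing, the final row of SEQ ID NO: 2 can be translated with the standard genetic code. The Python sketch below is illustrative only: the helper name `translate` and the way the codon table is built are assumptions of this example and are not part of the original disclosure.

```python
# Standard genetic code, with codons enumerated in T, C, A, G order
# at each of the three positions (TTT, TTC, TTA, TTG, TCT, ...).
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}


def translate(dna: str) -> str:
    """Translate a DNA string (whitespace ignored) into one-letter amino acids."""
    dna = "".join(dna.split()).upper()
    if len(dna) % 3 != 0:
        raise ValueError("sequence length is not a multiple of three")
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, len(dna), 3))


# Last codon row of SEQ ID NO: 2 as printed in the listing.
last_row = "TGG GCT AAG GAA TAT GCA AAT GAC TAA"
print(translate(last_row))  # WAKEYAND*
```

The one-letter result corresponds to Trp Ala Lys Glu Tyr Ala Asn Asp followed by the stop codon, in agreement with the translation line printed under that row.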



(2) INFORMATION FOR SEQ ID NO: 3:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 281 amino acids

(B) TYPE: amino acid
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: protein

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 3:

Met Asp Pro Val Ile Val Glu Asp Ile Glu Pro Gly Ile Tyr Tyr Gly
1 5 10 15

Ile Ser Asn Glu Asn Tyr His Ala Gly Pro Gly Ile Ser Lys Ser Gln
20 25 30

Leu Asp Asp Ile Ala Asp Thr Pro Ala Leu Tyr Leu Trp Arg Lys Asn
35 40 45

Ala Pro Val Asp Thr Thr Lys Thr Lys Thr Leu Asp Leu Gly Thr Ala
50 55 60

Phe His Cys Arg Val Leu Glu Pro Glu Glu Phe Ser Asn Arg Phe Ile
65 70 75 80

Val Ala Pro Glu Phe Asn Arg Arg Thr Asn Ala Gly Lys Glu Glu Glu
85 90 95

Lys Ala Phe Leu Met Glu Cys Ala Ser Thr Gly Lys Thr Val Ile Thr
100 105 110

Ala Glu Glu Gly Arg Lys Ile Glu Leu Met Tyr Gln Ser Val Met Ala
115 120 125

Leu Pro Leu Gly Gln Trp Leu Val Glu Ser Ala Gly His Ala Glu Ser
130 135 140

Ser Ile Tyr Trp Glu Asp Pro Glu Thr Gly Ile Leu Cys Arg Cys Arg
145 150 155 160

Pro Asp Lys Ile Ile Pro Glu Phe His Trp Ile Met Asp Val Lys Thr
165 170 175

Thr Ala Asp Ile Gln Arg Phe Lys Thr Ala Tyr Tyr Asp Tyr Arg Tyr
180 185 190

His Val Gln Asp Ala Phe Tyr Ser Asp Gly Tyr Glu Ala Gln Phe Gly
195 200 205

Val Gln Pro Thr Phe Val Phe Leu Val Ala Ser Thr Thr Ile Glu Cys
210 215 220

Gly Arg Tyr Pro Val Glu Ile Phe Met Met Gly Glu Glu Ala Lys Leu
225 230 235 240

Ala Gly Gln Gln Glu Tyr His Arg Asn Leu Arg Thr Leu Ser Asp Cys
245 250 255

Leu Asn Thr Asp Glu Trp Pro Ala Ile Lys Thr Leu Ser Leu Pro Arg
260 265 270

Trp Ala Lys Glu Tyr Ala Asn Asp *
275 280

(2) INFORMATION FOR SEQ ID NO: 4:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 810 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: both

(D) TOPOLOGY: both

(vii) IMMEDIATE SOURCE:
(B) CLONE: recT

(ix) FEATURE:

(A) NAME/KEY: CDS

(B) LOCATION: 1..810

(D) OTHER INFORMATION: /product= "recT"

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 4:

ATG ACT AAG CAA CCA CCA ATC GCA AAA GCC GAT CTG CAA AAA ACT CAG 48
Met Thr Lys Gln Pro Pro Ile Ala Lys Ala Asp Leu Gln Lys Thr Gln
285 290 295

GGA AAC CGT GCA CCA GCA GCA GTT AAA AAT AGC GAC GTG ATT AGT TTT 96
Gly Asn Arg Ala Pro Ala Ala Val Lys Asn Ser Asp Val Ile Ser Phe
300 305 310

ATT AAC CAG CCA TCA ATG AAA GAG CAA CTG GCA GCA GCT CTT CCA CGC 144
Ile Asn Gln Pro Ser Met Lys Glu Gln Leu Ala Ala Ala Leu Pro Arg
315 320 325

CAT ATG ACG GCT GAA CGT ATG ATC CGT ATC GCC ACC ACA GAA ATT CGT 192
His Met Thr Ala Glu Arg Met Ile Arg Ile Ala Thr Thr Glu Ile Arg
330 335 340 345

AAA GTT CCG GCG TTA GGA AAC TGT GAC ACT ATG AGT TTT GTC AGT GCG 240
Lys Val Pro Ala Leu Gly Asn Cys Asp Thr Met Ser Phe Val Ser Ala
350 355 360

ATC GTA CAG TGT TCA CAG CTC GGA CTT GAG CCA GGT AGC GCC CTC GGT 288
Ile Val Gln Cys Ser Gln Leu Gly Leu Glu Pro Gly Ser Ala Leu Gly
365 370 375

CAT GCA TAT TTA CTG CCT TTT GGT AAT AAA AAC GAA AAG AGC GGT AAA 336
His Ala Tyr Leu Leu Pro Phe Gly Asn Lys Asn Glu Lys Ser Gly Lys
380 385 390

AAG AAC GTT CAG CTA ATC ATT GGC TAT CGC GGC ATG ATT GAT CTG GCT 384
Lys Asn Val Gln Leu Ile Ile Gly Tyr Arg Gly Met Ile Asp Leu Ala
395 400 405

CGC CGT TCT GGT CAA ATC GCC AGC CTG TCA GCC CGT GTT GTC CGT GAA 432
Arg Arg Ser Gly Gln Ile Ala Ser Leu Ser Ala Arg Val Val Arg Glu
410 415 420 425

GGT GAC GAG TTT AGC TTC GAA TTT GGC CTT GAT GAA AAG TTA ATA CAC 480
Gly Asp Glu Phe Ser Phe Glu Phe Gly Leu Asp Glu Lys Leu Ile His
430 435 440

CGC CCG GGA GAA AAC GAA GAT GCC CCG GTT ACC CAC GTC TAT GCT GTC 528
Arg Pro Gly Glu Asn Glu Asp Ala Pro Val Thr His Val Tyr Ala Val
445 450 455

GCA AGA CTG AAA GAC GGA GGT ACT CAG TTT GAA GTT ATG ACG CGC AAA 576
Ala Arg Leu Lys Asp Gly Gly Thr Gln Phe Glu Val Met Thr Arg Lys
460 465 470

CAG ATT GAG CTG GTG CGC AGC CTG AGT AAA GCT GGT AAT AAC GGG CCG 624
Gln Ile Glu Leu Val Arg Ser Leu Ser Lys Ala Gly Asn Asn Gly Pro
475 480 485

TGG GTA ACT CAC TGG GAA GAA ATG GCA AAG AAA ACG GCT ATT CGT CGC 672
Trp Val Thr His Trp Glu Glu Met Ala Lys Lys Thr Ala Ile Arg Arg
490 495 500 505

CTG TTC AAA TAT TTG CCC GTA TCA ATT GAG ATC CAG CGT GCA GTA TCA 720
Leu Phe Lys Tyr Leu Pro Val Ser Ile Glu Ile Gln Arg Ala Val Ser
510 515 520

ATG GAT GAA AAG GAA CCA CTG ACA ATC GAT CCT GCA GAT TCC TCT GTA 768
Met Asp Glu Lys Glu Pro Leu Thr Ile Asp Pro Ala Asp Ser Ser Val
525 530 535

TTA ACC GGG GAA TAC AGT GTA ATC GAT AAT TCA GAG GAA TAG 810
Leu Thr Gly Glu Tyr Ser Val Ile Asp Asn Ser Glu Glu

(2) INFORMATION FOR SEQ ID NO: 5:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 270 amino acids

(B) TYPE: amino acid
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: protein

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 5:
Met Thr Lys Gln Pro Pro Ile Ala Lys Ala Asp Leu Gln Lys Thr Gln
1 5 10 15

Gly Asn Arg Ala Pro Ala Ala Val Lys Asn Ser Asp Val Ile Ser Phe
20 25 30

Ile Asn Gln Pro Ser Met Lys Glu Gln Leu Ala Ala Ala Leu Pro Arg
35 40 45

His Met Thr Ala Glu Arg Met Ile Arg Ile Ala Thr Thr Glu Ile Arg
50 55 60

Lys Val Pro Ala Leu Gly Asn Cys Asp Thr Met Ser Phe Val Ser Ala
65 70 75 80

Ile Val Gln Cys Ser Gln Leu Gly Leu Glu Pro Gly Ser Ala Leu Gly
85 90 95

His Ala Tyr Leu Leu Pro Phe Gly Asn Lys Asn Glu Lys Ser Gly Lys
100 105 110

Lys Asn Val Gln Leu Ile Ile Gly Tyr Arg Gly Met Ile Asp Leu Ala
115 120 125

Arg Arg Ser Gly Gln Ile Ala Ser Leu Ser Ala Arg Val Val Arg Glu
130 135 140

Gly Asp Glu Phe Ser Phe Glu Phe Gly Leu Asp Glu Lys Leu Ile His
145 150 155 160

Arg Pro Gly Glu Asn Glu Asp Ala Pro Val Thr His Val Tyr Ala Val
165 170 175

Ala Arg Leu Lys Asp Gly Gly Thr Gln Phe Glu Val Met Thr Arg Lys
180 185 190

Gln Ile Glu Leu Val Arg Ser Leu Ser Lys Ala Gly Asn Asn Gly Pro
195 200 205

Trp Val Thr His Trp Glu Glu Met Ala Lys Lys Thr Ala Ile Arg Arg
210 215 220

Leu Phe Lys Tyr Leu Pro Val Ser Ile Glu Ile Gln Arg Ala Val Ser
225 230 235 240

Met Asp Glu Lys Glu Pro Leu Thr Ile Asp Pro Ala Asp Ser Ser Val
245 250 255

Leu Thr Gly Glu Tyr Ser Val Ile Asp Asn Ser Glu Glu *
260 265 270



(2) INFORMATION FOR SEQ ID NO: 6:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 876 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: both
(D) TOPOLOGY: both

(vii) IMMEDIATE SOURCE:
(B) CLONE: araC

(ix) FEATURE:

(A) NAME/KEY: CDS

(B) LOCATION: complement (1..876)

(D) OTHER INFORMATION: /product= "araC"

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 6:






TGACAACTTG ACGGCTACAT CATTCACTTT TTCTTCACAA CCGGCACGGA ACTCGCTCGG 60
GCTGGCCCCG GTGCATTTTT TAAATACCCG CGAGAAATAG AGTTGATCGT CAAAACCAAC 120
ATTGCGACCG ACGGTGGCGA TAGGCATCCG GGTGGTGCTC AAAAGCAGCT TCGCCTGGCT 180
GATACGTTGG TCCTCGCGCC AGCTTAAGAC GCTAATCCCT AACTGCTGGC GGAAAAGATG 240
TGACAGACGC GACGGCGACA AGCAAACATG CTGTGCGACG CTGGCGATAT CAAAATTGCT 300
GTCTGCCAGG TGATCGCTGA TGTACTGACA AGCCTCGCGT ACCCGATTAT CCATCGGTGG 360
ATGGAGCGAC TCGTTAATCG CTTCCATGCG CCGCAGTAAC AATTGCTCAA GCAGATTTAT 420
CGCCAGCAGC TCCGAATAGC GCCCTTCCCC TTGCCCGGCG TTAATGATTT GCCCAAACAG 480
GTCGCTGAAA TGCGGCTGGT GCGCTTCATC CGGGCGAAAG AACCCCGTAT TGGCAAATAT 540
TGACGGCCAG TTAAGCCATT CATGCCAGTA GGCGCGCGGA CGAAAGTAAA CCCACTGGTG 600
ATACCATTCG CGAGCCTCCG GATGACGACC GTAGTGATGA ATCTCTCCTG GCGGGAACAG 660
CAAAATATCA CCCGGTCGGC AAACAAATTC TCGTCCCTGA TTTTTCACCA CCCCCTGACC 720
GCGAATGGTG AGATTGAGAA TATAACCTTT CATTCCCAGC GGTCGGTCGA TAAAAAAATC 780
GAGATAACCG TTGGCCTCAA TCGGCGTTAA ACCCGCCACC AGATGGGCAT TAAACGAGTA 840
TCCCGGCAGC AGGGGATCAT TTTGCGCTTC AGCCAT 876
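Because the araC coding sequence is annotated on the complementary strand, the printed strand of SEQ ID NO: 6 has to be reverse-complemented before it can be read as codons. The short Python sketch below illustrates this with the last 16 bases of the listing; the function name `reverse_complement` is an assumption of this example and does not appear in the original document.

```python
# Map each base to its Watson-Crick partner.
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(dna: str) -> str:
    """Return the reverse complement of a DNA string (whitespace ignored)."""
    dna = "".join(dna.split()).upper()
    return dna.translate(COMPLEMENT)[::-1]


# Final two groups of SEQ ID NO: 6 as printed (positions 861 to 876).
tail = "TTTGCGCTTC AGCCAT"
print(reverse_complement(tail))  # ATGGCTGAAGCGCAAA
```

The reverse complement begins with ATG GCT GAA GCG CAA, which matches the first residues listed for SEQ ID NO: 7 (Met Ala Glu Ala Gln).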



(2) INFORMATION FOR SEQ ID NO: 7: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 292 amino acids

(B) TYPE: amino acid
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: protein 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 7: 

Met Ala Glu Ala Gln Asn Asp Pro Leu Leu Pro Gly Tyr Ser Phe Asn
1 5 10 15

Ala His Leu Val Ala Gly Leu Thr Pro Ile Glu Ala Asn Gly Tyr Leu
20 25 30

Asp Phe Phe Ile Asp Arg Pro Leu Gly Met Lys Gly Tyr Ile Leu Asn
35 40 45

Leu Thr Ile Arg Gly Gln Gly Val Val Lys Asn Gln Gly Arg Glu Phe
50 55 60

Val Cys Arg Pro Gly Asp Ile Leu Leu Phe Pro Pro Gly Glu Ile His
65 70 75 80

His Tyr Gly Arg His Pro Glu Ala Arg Glu Trp Tyr His Gln Trp Val
85 90 95

Tyr Phe Arg Pro Arg Ala Tyr Trp His Glu Trp Leu Asn Trp Pro Ser
100 105 110

Ile Phe Ala Asn Thr Gly Phe Phe Arg Pro Asp Glu Ala His Gln Pro
115 120 125

His Phe Ser Asp Leu Phe Gly Gln Ile Ile Asn Ala Gly Gln Gly Glu
130 135 140

Gly Arg Tyr Ser Glu Leu Leu Ala Ile Asn Leu Leu Glu Gln Leu Leu
145 150 155 160

Leu Arg Arg Met Glu Ala Ile Asn Glu Ser Leu His Pro Pro Met Asp
165 170 175

Asn Arg Val Arg Glu Ala Cys Gln Tyr Ile Ser Asp His Leu Ala Asp
180 185 190

Ser Asn Phe Asp Ile Ala Ser Val Ala Gln His Val Cys Leu Ser Pro
195 200 205

Ser Arg Leu Ser His Leu Phe Arg Gln Gln Leu Gly Ile Ser Val Leu
210 215 220

Ser Trp Arg Glu Asp Gln Arg Ile Ser Gln Ala Lys Leu Leu Leu Ser
225 230 235 240

Thr Thr Arg Met Pro Ile Ala Thr Val Gly Arg Asn Val Gly Phe Asp
245 250 255

Asp Gln Leu Tyr Phe Ser Arg Val Phe Lys Lys Cys Thr Gly Ala Ser
260 265 270

Pro Ser Glu Phe Arg Ala Gly Cys Glu Glu Lys Val Asn Asp Val Ala
275 280 285

Val Lys Leu Ser
290



(2) INFORMATION FOR SEQ ID NO: 8:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 861 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: both

(D) TOPOLOGY: both

(vii) IMMEDIATE SOURCE:
(B) CLONE: bla

(ix) FEATURE:

(A) NAME/KEY: CDS
(B) LOCATION: 1..861

(D) OTHER INFORMATION: /product= "bla"

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 8:

ATG AGT ATT CAA CAT TTC CGT GTC GCC CTT ATT CCC TTT TTT GCG GCA 48
Met Ser Ile Gln His Phe Arg Val Ala Leu Ile Pro Phe Phe Ala Ala
295 300 305

TTT TGC CTT CCT GTT TTT GCT CAC CCA GAA ACG CTG GTG AAA GTA AAA 96
Phe Cys Leu Pro Val Phe Ala His Pro Glu Thr Leu Val Lys Val Lys
310 315 320

GAT GCT GAA GAT CAG TTG GGT GCA CGA GTG GGT TAC ATC GAA CTG GAT 144
Asp Ala Glu Asp Gln Leu Gly Ala Arg Val Gly Tyr Ile Glu Leu Asp
325 330 335 340

CTG AAC AGC GGT AAG ATC CTT GAG AGT TTT CGC CCC GAA GAA CGT TTT 192
Leu Asn Ser Gly Lys Ile Leu Glu Ser Phe Arg Pro Glu Glu Arg Phe
345 350 355

CCA ATG ATG AGC ACT TTT AAA GTT CTG CTA TGT GGC GCG GTA TTA TCC 240
Pro Met Met Ser Thr Phe Lys Val Leu Leu Cys Gly Ala Val Leu Ser
360 365 370

CGT GTT GAC GCC GGG CAA GAG CAA CTC GGT CGC CGC ATA CAC TAT TCT 288
Arg Val Asp Ala Gly Gln Glu Gln Leu Gly Arg Arg Ile His Tyr Ser
375 380 385

CAG AAT GAC TTG GTT GAG TAC TCA CCA GTC ACA GAA AAG CAT CTT ACG 336
Gln Asn Asp Leu Val Glu Tyr Ser Pro Val Thr Glu Lys His Leu Thr
390 395 400

GAT GGC ATG ACA GTA AGA GAA TTA TGC AGT GCT GCC ATA ACC ATG AGT 384
Asp Gly Met Thr Val Arg Glu Leu Cys Ser Ala Ala Ile Thr Met Ser
405 410 415 420

GAT AAC ACT GCG GCC AAC TTA CTT CTG ACA ACG ATC GGA GGA CCG AAG 432
Asp Asn Thr Ala Ala Asn Leu Leu Leu Thr Thr Ile Gly Gly Pro Lys
425 430 435

GAG CTA ACC GCT TTT TTG CAC AAC ATG GGG GAT CAT GTA ACT CGC CTT 480
Glu Leu Thr Ala Phe Leu His Asn Met Gly Asp His Val Thr Arg Leu
440 445 450

GAT CGT TGG GAA CCG GAG CTG AAT GAA GCC ATA CCA AAC GAC GAG CGT 528
Asp Arg Trp Glu Pro Glu Leu Asn Glu Ala Ile Pro Asn Asp Glu Arg
455 460 465

GAC ACC ACG ATG CCT GTA GCA ATG GCA ACA ACG TTG CGC AAA CTA TTA 576
Asp Thr Thr Met Pro Val Ala Met Ala Thr Thr Leu Arg Lys Leu Leu
470 475 480

ACT GGC GAA CTA CTT ACT CTA GCT TCC CGG CAA CAA TTA ATA GAC TGG 624
Thr Gly Glu Leu Leu Thr Leu Ala Ser Arg Gln Gln Leu Ile Asp Trp
485 490 495 500

ATG GAG GCG GAT AAA GTT GCA GGA CCA CTT CTG CGC TCG GCC CTT CCG 672
Met Glu Ala Asp Lys Val Ala Gly Pro Leu Leu Arg Ser Ala Leu Pro
505 510 515

GCT GGC TGG TTT ATT GCT GAT AAA TCT GGA GCC GGT GAG CGT GGG TCT 720
Ala Gly Trp Phe Ile Ala Asp Lys Ser Gly Ala Gly Glu Arg Gly Ser
520 525 530

CGC GGT ATC ATT GCA GCA CTG GGG CCA GAT GGT AAG CCC TCC CGT ATC 768
Arg Gly Ile Ile Ala Ala Leu Gly Pro Asp Gly Lys Pro Ser Arg Ile
535 540 545

GTA GTT ATC TAC ACG ACG GGG AGT CAG GCA ACT ATG GAT GAA CGA AAT 816
Val Val Ile Tyr Thr Thr Gly Ser Gln Ala Thr Met Asp Glu Arg Asn
550 555 560

AGA CAG ATC GCT GAG ATA GGT GCC TCA CTG ATT AAG CAT TGG TAA 861
Arg Gln Ile Ala Glu Ile Gly Ala Ser Leu Ile Lys His Trp *
565 570 575

(2) INFORMATION FOR SEQ ID NO: 9:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 287 amino acids
(B) TYPE: amino acid
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: protein

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 9:

Met Ser Ile Gln His Phe Arg Val Ala Leu Ile Pro Phe Phe Ala Ala
1 5 10 15

Phe Cys Leu Pro Val Phe Ala His Pro Glu Thr Leu Val Lys Val Lys
20 25 30

Asp Ala Glu Asp Gln Leu Gly Ala Arg Val Gly Tyr Ile Glu Leu Asp
35 40 45

Leu Asn Ser Gly Lys Ile Leu Glu Ser Phe Arg Pro Glu Glu Arg Phe
50 55 60

Pro Met Met Ser Thr Phe Lys Val Leu Leu Cys Gly Ala Val Leu Ser
65 70 75 80

Arg Val Asp Ala Gly Gln Glu Gln Leu Gly Arg Arg Ile His Tyr Ser
85 90 95

Gln Asn Asp Leu Val Glu Tyr Ser Pro Val Thr Glu Lys His Leu Thr
100 105 110

Asp Gly Met Thr Val Arg Glu Leu Cys Ser Ala Ala Ile Thr Met Ser
115 120 125

Asp Asn Thr Ala Ala Asn Leu Leu Leu Thr Thr Ile Gly Gly Pro Lys
130 135 140

Glu Leu Thr Ala Phe Leu His Asn Met Gly Asp His Val Thr Arg Leu
145 150 155 160

Asp Arg Trp Glu Pro Glu Leu Asn Glu Ala Ile Pro Asn Asp Glu Arg
165 170 175

Asp Thr Thr Met Pro Val Ala Met Ala Thr Thr Leu Arg Lys Leu Leu
180 185 190

Thr Gly Glu Leu Leu Thr Leu Ala Ser Arg Gln Gln Leu Ile Asp Trp
195 200 205

Met Glu Ala Asp Lys Val Ala Gly Pro Leu Leu Arg Ser Ala Leu Pro
210 215 220

Ala Gly Trp Phe Ile Ala Asp Lys Ser Gly Ala Gly Glu Arg Gly Ser
225 230 235 240

Arg Gly Ile Ile Ala Ala Leu Gly Pro Asp Gly Lys Pro Ser Arg Ile
245 250 255

Val Val Ile Tyr Thr Thr Gly Ser Gln Ala Thr Met Asp Glu Arg Asn
260 265 270

Arg Gln Ile Ala Glu Ile Gly Ala Ser Leu Ile Lys His Trp
275 280 285

(2) INFORMATION FOR SEQ ID NO: 10:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 7195 base pairs
(B) TYPE: nucleic acid

(C) STRANDEDNESS: both

(D) TOPOLOGY: both

(vii) IMMEDIATE SOURCE:

(B) CLONE: pBAD-ETgamma

(ix) FEATURE:

(A) NAME/KEY: misc_feature

(B) LOCATION: 3588..4004

(D) OTHER INFORMATION: /product= "red gamma"

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 10:
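Feature locations in this listing use inclusive start and end coordinates, so the length of a feature is end minus start plus one. As a small illustration, the Python sketch below applies this to the red gamma feature of SEQ ID NO: 10 (positions 3588 and 4004); the helper name `feature_length` is an assumption of this example only.

```python
def feature_length(start: int, end: int) -> int:
    """Length in bases of a feature given inclusive 1-based coordinates."""
    if end < start:
        raise ValueError("end coordinate precedes start coordinate")
    return end - start + 1


# Red gamma feature of SEQ ID NO: 10 as annotated in the listing.
length = feature_length(3588, 4004)
codons, remainder = divmod(length, 3)
print(length, codons, remainder)  # 417 139 0
```

A remainder of zero indicates that the annotated span is a whole number of codons, which is a simple sanity check on coordinates that have been through text extraction.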






ATCGATGCAT AATGTGCCTG TCAAATGGAC GAAGCAGGGA TTCTGCAAAC CCTATGCTAC 60
TCCGTCAAGC CGTCAATTGT CTGATTCGTT ACCAATTATG ACAACTTGAC GGCTACATCA 120
TTCACTTTTT CTTCACAACC GGCACGGAAC TCGCTCGGGC TGGCCCCGGT GCATTTTTTA 180
AATACCCGCG AGAAATAGAG TTGATCGTCA AAACCAACAT TGCGACCGAC GGTGGCGATA 240
GGCATCCGGG TGGTGCTCAA AAGCAGCTTC GCCTGGCTGA TACGTTGGTC CTCGCGCCAG 300
CTTAAGACGC TAATCCCTAA CTGCTGGCGG AAAAGATGTG ACAGACGCGA CGGCGACAAG 360
CAAACATGCT GTGCGACGCT GGCGATATCA AAATTGCTGT CTGCCAGGTG ATCGCTGATG 420
TACTGACAAG CCTCGCGTAC CCGATTATCC ATCGGTGGAT GGAGCGACTC GTTAATCGCT 480
TCCATGCGCC GCAGTAACAA TTGCTCAAGC AGATTTATCG CCAGCAGCTC CGAATAGCGC 540
CCTTCCCCTT GCCCGGCGTT AATGATTTGC CCAAACAGGT CGCTGAAATG CGGCTGGTGC 600
GCTTCATCCG GGCGAAAGAA CCCCGTATTG GCAAATATTG ACGGCCAGTT AAGCCATTCA 660
TGCCAGTAGG CGCGCGGACG AAAGTAAACC CACTGGTGAT ACCATTCGCG AGCCTCCGGA 720
TGACGACCGT AGTGATGAAT CTCTCCTGGC GGGAACAGCA AAATATCACC CGGTCGGCAA 780
ACAAATTCTC GTCCCTGATT TTTCACCACC CCCTGACCGC GAATGGTGAG ATTGAGAATA 840
TAACCTTTCA TTCCCAGCGG TCGGTCGATA AAAAAATCGA GATAACCGTT GGCCTCAATC 900
GGCGTTAAAC CCGCCACCAG ATGGGCATTA AACGAGTATC CCGGCAGCAG GGGATCATTT 960
TGCGCTTCAG CCATACTTTT CATACTCCCG CCATTCAGAG AAGAAACCAA TTGTCCATAT 1020
TGCATCAGAC ATTGCCGTCA CTGCGTCTTT TACTGGCTCT TCTCGCTAAC CAAACCGGTA 1080
ACCCCGCTTA TTAAAAGCAT TCTGTAACAA AGCGGGACCA AAGCCATGAC AAAAACGCGT 1140
AACAAAAGTG TCTATAATCA CGGCAGAAAA GTCCACATTG ATTATTTGCA CGGCGTCACA 1200
CTTTGCTATG CCATAGCATT TTTATCCATA AGATTAGCGG ATCCTACCTG ACGCTTTTTA 1260
TCGCAACTCT CTACTGTTTC TCCATACCCG TTTTTTTGGG CTAGCAGGAG GAATTCACCA 1320
TGGATCCCGT AATCGTAGAA GACATAGAGC CAGGTATTTA TTACGGAATT TCGAATGAGA 1380
ATTACCACGC GGGTCCCGGT ATCAGTAAGT CTCAGCTCGA TGACATTGCT GATACTCCGG 1440
CACTATATTT GTGGCGTAAA AATGCCCCCG TGGACACCAC AAAGACAAAA ACGCTCGATT 1500
TAGGAACTGC TTTCCACTGC CGGGTACTTG AACCGGAAGA ATTGAGTAAC CGCTTTATCG 1560
TAGCACCTGA ATTTAACCGC CGTACAAACG CCGGAAAAGA AGAAGAGAAA GCGTTTCTGA 1620
TGGAATGCGC AAGCACAGGA AAAACGGTTA TCACTGCGGA AGAAGGCCGG AAAATTGAAC 1680
TCATGTATCA AAGCGTTATG GCTTTGCCGC TGGGGCAATG GCTTGTTGAA AGCGCCGGAC 1740
ACGCTGAATC ATCAATTTAC TGGGAAGATC CTGAAACAGG AATTTTGTGT CGGTGCCGTC 1800
CGGACAAAAT TATCCCTGAA TTTCACTGGA TCATGGACGT GAAAACTACG GCGGATATTC 1860
AACGATTCAA AACCGCTTAT TACGACTACC GCTATCACGT TCAGGATGCA TTCTACAGTG 1920
ACGGTTATGA AGCACAGTTT GGAGTGCAGC CAACTTTCGT TTTTCTGGTT GCCAGCACAA 1980
CTATTGAATG CGGACGTTAT CCGGTTGAAA TTTTCATGAT GGGCGAAGAA GCAAAACTGG 2040
CAGGTCAACA GGAATATCAC CGCAATCTGC GAACCCTGTC TGACTGCCTG AATACCGATG 2100
AATGGCCAGC TATTAAGACA TTATCACTGC CCCGCTGGGC TAAGGAATAT GCAAATGACT 2160
AGATCTCGAG GTACCCGAGC ACGTGTTGAC AATTAATCAT CGGCATAGTA TATCGGCATA 2220
GTATAATACG ACAAGGTGAG GAACTAAACC ATGGCTAAGC AACCACCAAT CGCAAAAGCC 2280
GATCTGCAAA AAACTCAGGG AAACCGTGCA CCAGCAGCAG TTAAAAATAG CGACGTGATT 2340
AGTTTTATTA ACCAGCCATC AATGAAAGAG CAACTGGCAG CAGCTCTTCC ACGCCATATG 2400
ACGGCTGAAC GTATGATCCG TATCGCCACC ACAGAAATTC GTAAAGTTCC GGCGTTAGGA 2460
AACTGTGACA CTATGAGTTT TGTCAGTGCG ATCGTACAGT GTTCACAGCT CGGACTTGAG 2520
CCAGGTAGCG CCCTCGGTCA TGCATATTTA CTGCCTTTTG GTAATAAAAA CGAAAAGAGC 2580
GGTAAAAAGA ACGTTCAGCT AATCATTGGC TATCGCGGCA TGATTGATCT GGCTCGCCGT 2640
TCTGGTCAAA TCGCCAGCCT GTCAGCCCGT GTTGTCCGTG AAGGTGACGA GTTTAGCTTC 2700
GAATTTGGCC TTGATGAAAA GTTAATACAC CGCCCGGGAG AAAACGAAGA TGCCCCGGTT 2760
ACCCACGTCT ATGCTGTCGC AAGACTGAAA GACGGAGGTA CTCAGTTTGA AGTTATGACG 2820
CGCAAACAGA TTGAGCTGGT GCGCAGCCTG AGTAAAGCTG GTAATAACGG GCCGTGGGTA 2880
ACTCACTGGG AAGAAATGGC AAAGAAAACG GCTATTCGTC GCCTGTTCAA ATATTTGCCC 2940
GTATCAATTG AGATCCAGCG TGCAGTATCA ATGGATGAAA AGGAACCACT GACAATCGAT 3000



CCTGCAGATT CCTCTGTATT AACCGGGGAA TACAGTGTAA TCGATAATTC AGAGGAATAG 3060
ACCTAAGCTT CCTGCTGAAC ATCAAAGGCA AGAAAACATC TGTTGTCAAA GACAGCATCC 3120
TTGAACAAGG ACAATTAACA GTTAACAAAT AAAAACGCAA AAGAAAATGC CGATATCCTA 3180
TTGGCATTTT CTTTTATTTC TTATCAACAT AAAGGTGAAT CCCATACCTC GAGCTTCACG 3240
CTGCCGCAAG CACTCAGGGC GCAAGGGCTG CTAAAAGGAA GCGGAACACG TAGAAAGCCA 3300
G T 7 2 G C A 7- AA ACGGTGCTGA CCCCGGATGA ATGTCAGCTA CTGGGCTATC TGGACAAGGG 3360
AAAACGCAAG CGCAAAGAGA AAGCAGGTAG CTTGCAGTGG GCTTACATGG CGATAGCTAG 3420
ACTGGGCGGT TTTATGGACA GCAAGCGAAC CGGAATTGCC AGCTGGGGCG CCCTCTGGTA 3480
AGGTTGGGAA GCCCTGCAAA GTAAACTGGA TGGCTTTCTT GCGGCCAAGG ATCTGATGGC 3540
GCAGGGGATC AAGATCTGAT CAAGAGACAG GATGAGGATC GTTTCGCATG GATATTAATA 3600
CCGAAACTGA GATCAAGCAA AAGCATTCAC TAACCCCCTT TCCTGTTTTC CTAATCAGCC 3660
CGGCATTTCG CGGGCGATAT TTTCACAGCT ATTTCAGGAG TTCAGCCATG AACGCTTATT 3720
ACATTCAGGA TCGTCTTGAG GCTCAGAGCT GGGCGCGTCA CTACCAGCAG CTCGCCCGTG 3780
AAGAGAAAGA GGCAGAACTG GCAGACGACA TGGAAAAAGG CCTGCCCCAG CACCTGTTTG 3840
AATCGCTATG CATCGATCAT TTGCAACGCC ACGGGGCCAG CAAAAAATCC ATTACCCGTG 3900
CGTTTGATGA CGATGTTGAG TTTCAGGAGC GCATGGCAGA ACACATCCGG TACATGGTTG 3960
AAACCATTGC TCACCACCAG GTTGATATTG ATTCAGAGGT ATAAAACGAG TAGAAGCTTG 4020
GCTGTTTTGG CGGATGAGAG AAGATTTTCA GCCTGATACA GATTAAATCA GAACGCAGAA 4080
GCGGTCTGAT AAAACAGAAT TTGCCTGGCG GCAGTAGCGC GGTGGTCCCA CCTGACCCCA 4140
TGCCGAACTC AGAAGTGAAA CGCCGTAGCG CCGATGGTAG TGTGGGGTCT CCCCATGCGA 4200
GAGTAGGGAA CTGCCAGGCA TCAAATAAAA CGAAAGGCTC AGTCGAAAGA CTGGGCCTTT 4260
CGCTTTATCT GTTGTTTGTC GGTGAACGCT CTCCTGAGTA GGACAAATCC GCCGGGAGCG 4320
GATTTGAACG TTGCGAAGCA ACGGCCCGGA GGGTGGCGGG CAGGACGCCC GCCATAAACT 4380
GCCAGGCATC AAATTAAGCA GAAGGCCATC CTGACGGATG GCCTTTTTGC GTTTCTACAA 4440
ACTCTTTTGT TTATTTTTCT AAATACATTC AAATATGTAT CCGCTCATGA GACAATAACC 4500
CTGATAAATG CTTCAATAAT ATTGAAAAAG GAAGAGTATG AGTATTCAAC ATTTCCGTGT 4560
CGCCCTTATT CCCTTTTTTG CGGCATTTTG CCTTCCTGTT TTTGCTCACC CAGAAACGCT 4620
GGTGAAAGTA AAAGATGCTG AAGATCAGTT GGGTGCACGA GTGGGTTACA TCGAACTGGA 4680
TCTCAACAGC GGTAAGATCC TTGAGAGTTT TCGCCCCGAA GAACGTTTTC CAATGATGAG 4740
CACTTTTAAA GTTCTGCTAT GTGGCGCGGT ATTATCCCGT GTTGACGCCG GGCAAGAGCA 4800
ACTCGGTCGC CGCATACACT ATTCTCAGAA TGACTTGGTT GAGTACTCAC CAGTCACAGA 4860
AAAGCATCTT ACGGATGGCA TGACAGTAAG AGAATTATGC AGTGCTGCCA TAACCATGAG 4920
TGATAACACT GCGGCCAACT TACTTCTGAC AACGATCGGA GGACCGAAGG AGCTAACCGC 4980
TTTTTTGCAC AACATGGGGG ATCATGTAAC TCGCCTTGAT CGTTGGGAAC CGGAGCTGAA 5040
TGAAGCCATA CCAAACGACG AGCGTGACAC CACGATGCCT GTAGCAATGG CAACAACGTT 5100



GCGCAAACTA TTAACTGGCG AACTACTTAC TCTAGCTTCC CGGCAACAAT TAATAGACTG 5160
GATGGAGGCG GATAAAGTTG CAGGACCACT TCTGCGCTCG GCCCTTCCGG CTGGCTGGTT 5220
TATTGCTGAT AAATCTGGAG CCGGTGAGCG TGGGTCTCGC GGTATCATTG CAGCACTGGG 5280
GCCAGATGGT AAGCCCTCCC GTATCGTAGT TATCTACACG ACGGGGAGTC AGGCAACTAT 5340
GGATGAACGA AATAGACAGA TCGCTGAGAT AGGTGCCTCA CTGATTAAGC ATTGGTAACT 5400
GTCAGACCAA GTTTACTCAT ATATACTTTA GATTGATTTA CGCGCCCTGT AGCGGCGCAT 5460
TAAGCGCGGC GGGTGTGGTG GTTACGCGCA GCGTGACCGC TACACTTGCC AGCGCCCTAG 5520
CGCCCGCTCC TTTCGCTTTC TTCCCTTCCT TTCTCGCCAC GTTCGCCGGC TTTCCCCGTC 5580
AAGCTCTAAA TCGGGGGCTC CCTTTAGGGT TCCGATTTAG TGCTTTACGG CACCTCGACC 5640
CCAAAAAACT TGATTTGGGT GATGGTTCAC GTAGTGGGCC ATCGCCCTGA TAGACGGTTT 5700
TTCGCCCTTT GACGTTGGAG TCCACGTTCT TTAATAGTGG ACTCTTGTTC CAAACTTGAA 5760
CAACACTCAA CCCTATCTCG GGCTATTCTT TTGATTTATA AGGGATTTTG CCGATTTCGG 5820
CCTATTGGTT AAAAAATGAG CTGATTTAAC AAAAATTTAA CGCGAATTTT AACAAAATAT 5880
TAACGTTTAC AATTTAAAAG GATCTAGGTG AAGATCCTTT TTGATAATCT CATGACCAAA 5940
ATCCCTTAAC GTGAGTTTTC GTTCCACTGA GCGTCAGACC CCGTAGAAAA GATCAAAGGA 6000
TCTTCTTGAG ATCCTTTTTT TCTGCGCGTA ATCTGCTGCT TGCAAACAAA AAAACCACCG 6060
CTACCAGCGG TGGTTTGTTT GCCGGATCAA GAGCTACCAA CTCTTTTTCC GAAGGTAACT 6120
GGCTTCAGCA GAGCGCAGAT ACCAAATACT GTCCTTCTAG TGTAGCCGTA GTTAGGCCAC 6180
CACTTCAAGA ACTCTGTAGC ACCGCCTACA TACCTCGCTC TGCTAATCCT GTTACCAGTG 6240
GCTGCTGCCA GTGGCGATAA GTCGTGTCTT ACCGGGTTGG ACTCAAGACG ATAGTTACCG 6300
GATAAGGCGC AGCGGTCGGG CTGAACGGGG GGTTCGTGCA CACAGCCCAG CTTGGAGCGA 6360
ACGACCTACA CCGAACTGAG ATACCTACAG CGTGAGCTAT GAGAAAGCGC CACGCTTCCC 6420
GAAGGGAGAA AGGCGGACAG GTATCCGGTA AGCGGCAGGG TCGGAACAGG AGAGCGCACG 6480
AGGGAGCTTC CAGGGGGAAA CGCCTGGTAT CTTTATAGTC CTGTCGGGTT TCGCCACCTC 6540
TGACTTGAGC GTCGATTTTT GTGATGCTCG TCAGGGGGGC GGAGCCTATG GAAAAACGCC 6600
AGCAACGCGG CCTTTTTACG GTTCCTGGCC TTTTGCTGGC CTTTTGCTCA CATGTTCTTT 6660
CCTGCGTTAT CCCCTGATTC TGTGGATAAC CGTATTACCG CCTTTGAGTG AGCTGATACC 6720
GCTCGCCGCA GCCGAACGAC CGAGCGCAGC GAGTCAGTGA GCGAGGAAGC GGAAGAGCGC 6780
CTGATGCGGT ATTTTCTCCT TACGCATCTG TGCGGTATTT CACACCGCAT AGGGTCATGG 6840
CTGCGCCCCG ACACCCGCCA ACACCCGCTG ACGCGCCCTG ACGGGCTTGT CTGCTCCCGG 6900
CATCCGCTTA CAGACAAGCT GTGACCGTCT CCGGGAGCTG CATGTGTCAG AGGTTTTCAC 6960
CGTCATCACC GAAACGCGCG AGGCAGCAAG GAGATGGCGC CCAACAGTCC CCCGGCCACG 7020
GGGCCTGCCA CCATACCCAC GCCGAAACAA GCGCTCATGA GCCCGAAGTG GCGAGCCCGA 7080
TCTTCCCCAT CGGTGATGTC GGCGATATAG GCGCCAGCAA CCGCACCTGT GGCGCCGGTG 7140
ATGCCGGCCA CGATGCGTCC GGCGTAGAGG ATCTGCTCAT GTTTGACAGC TTATC 7195
(2) INFORMATION FOR SEQ ID NO: 11:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 7010 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: both

(D) TOPOLOGY: both

(vii) IMMEDIATE SOURCE:

(B) CLONE: pBAD-alpha-beta-gamma

(ix) FEATURE:

(A) NAME/KEY: CDS

(B) LOCATION: 1320..2000

(D) OTHER INFORMATION: /product= "red alpha"

(ix) FEATURE:

(A) NAME/KEY: CDS

(B) LOCATION: 2086..2871

(D) OTHER INFORMATION: /product= "red beta"

(ix) FEATURE:

(A) NAME/KEY: CDS

(B) LOCATION: 3403..3819

(D) OTHER INFORMATION: /product= "red gamma"



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 11: 



ATCGATGCAT AATGTGCCTG TCAAATGGAC GAAGCAGGGA TTCTGCAAAC CCTATGCTAC 60
TCCGTCAAGC CGTCAATTGT CTGATTCGCT ACCAATTATG ACAACTTGAC GGCTACATCA 120
TTCACTTTTT CTTCACAACC GGCACGGAAC TCGCTCGGGC TGGCCCCGGT GCATTTTTTA 180
AATACCCGCG AGAAATAGAG TTGATCGTCA AAACCAACAT TGCGACCGAC GGTGGCGATA 240
GGCATCCGGG TGGTGCTCAA AAGCAGCTTC GCCTGGCTGA TACGTTGGTC CTCGCGCCAG 300
CTTAAGACGC TAATCCCTAA CTGCTGGCGG AAAAGATGTG ACAGACGCGA CGGCGACAAG 360
CAAACATGCT GTGCGACGCT GGCGATATCA AAATTGCTGT CTGCCAGGTG ATCGCTGATG 420
TACTGACAAG CCTCGCGTAC CCGATTATCC ATCGGTGGAT GGAGCGACTC GTTAATCGCT 480
TCCATGCGCC GCAGTAACAA TTGCTCAAGC AGATTTATCG CCAGCAGCTC CGAATAGCGC 540
CCTTCCCCTT GCCCGGCGTT AATGATTTGC CCAAACAGGT CGCTGAAATG CGGCTGGTGC 600
GCTTCATCCG GGCGAAAGAA CCCCGTATTG GCAAATATTG ACGGCCAGTT AAGCCATTCA 660
TGCCAGTAGG CGCGCGGACG AAAGTAAACC CACTGGTGAT ACCATTCGCG AGCCTCCGGA 720
TGACGACCGT AGTGATGAAT CTCTCCTGGC GGGAACAGCA AAATATCACC CGGTCGGCAA 780
ACAAATTCTC GTCCCTGATT TTTCACCACC CCCTGACCGC GAATGGTGAG ATTGAGAATA 840
TAACCTTTCA TTCCCAGCGG TCGGTCGATA AAAAAATCGA GATAACCGTT GGCCTCAATC 900
GGCGTTAAAC CCGCCACCAG ATGGGCATTA AACGAGTATC CCGGCAGCAG GGGATCATTT 960
TGCGCTTCAG CCATACTTTT CATACTCCCG CCATTCAGAG AAGAAACCAA TTGTCCATAT 1020
TGCATCAGAC ATTGCCGTCA CTGCGTCTTT TACTGGCTCT TCTCGCTAAC CAAACCGGTA 1080
ACCCCGCTTA TTAAAAGCAT TCTGTAACAA AGCGGGACCA AAGCCATGAC AAAAACGCGT 1140
AACAAAAGTG TCTATAATCA CGGCAGAAAA GTCCACATTG ATTATTTGCA CGGCGTCACA 1200
CTTTGCTATG CCATAGCATT TTTATCCATA AGATTAGCGG ATCCTACCTG ACGCTTTTTA 1260
TCGCAACTCT CTACTGTTTC TCCATACCCG TTTTTTTGGG CTAGCAGGAG GAATTCACC 1319

ATG ACA CCG GAC ATT ATC CTG CAG CGT ACC GGG ATC GAT GTG AHA r,rr 
me. x... rrc Asp ^ e lie Leu Gin Arg- Thr Giy He Asp Val Arg Ala 
"~ 300 

GTC GAA CAG GGG GAT GAT GCG TGG CAC AAA TTA CGG CTC GGC GTC ATC ■ - 

" n Giy A3? Asp Ala Tr P His Lys Leu Arg Leu Glv Val He 
305 . 310 315 

-br SST ^ ^* GAC ^ GTG ATA GCA AAA CCC CGC TCC GGA AAG 14 6 

*hr A,a ber Glu Val His Asn Val He Ala Lys Pro Arg Ser Glv Lvs 

325 330 * 335 



151 



155 



^ TGG CCi GA^ ATG AAA ATG TCC TAC TTC CAC ACC CTG CTT GCT GAG 
L^s Trp Pro Asp Met Lys Met Ser Tyr Phe His Thr Leu Leu Ala Glu 
340 " 345 350 

vlT rZ^ tS° S? T ?- T ? G ? T CCG GAA GTT GCT ^ GCA CTG GCC TGG 

Val Cvs Thr Giy Val Axa Pro Glu Val Asn Ala Lys Ala Leu Ala Trp 

355 360 365 

GGA AAA CAG TAC GAG AAC GAC GCC AGA ACC CTG TTT GAA TTC ACT TCC 16 0 

-ys -In Tyr Giu Asn Asp Ala Arg Thr Leu Phe Glu Phe Thr Ser 
3/0 375 380 

r?- C vl? ^ GT T ^ C J GAA TCC CCG ATC ATC TAT CGC GAC GAA AGT ATG 16 5 

Y*Z Asn Th -- G-u Ser Pro He He Tyr Arg Aso Giu Ser Met 

33z > 390 395 

CGT ACC GCC TGC TCT CCC GAT GGT TTA TGC AGT GAC GGC AAC GGC CTT I7r 

An% ^ YS ?r ° AS? Gly Leu C ^ 3 Ser As P Asn Glv Leu 

,J5 4i0 415 

GAA -CTG AAA TGC CCG TTT ACC TCC CGG GAT TTC ATG AAG TTC CGG CTC 
Glu Leu Lys Cys Pro Phe Thr Ser Arg Asp Phe Met Lvs Phe Arg Leu 
4 ^° 425 430 



175 



Gil r?3 ll C ^ C A T A ^ TCA GCT TAC ATG GCC CAG GTG CAG TAC 17 9 

Gly Gly Phe Gxu Ala lie Lys Ser Ala Tyr Met Ala Gin Val Gin Tvr 

43 = 440 445 

cf C t T< t J GG GTG ACG CGA AAT GCC TGG TAC TTT GCC AAC TAT GAC 13 4 

S__ Met Trp Val Thr Arg Lys Asn Ala Trp Tvr Phe Ala Asn Tvr Asp 
450 455 460 

CCG CGT ATG AAG CGT GAA GGC CTG CAT TAT GTC GTG ATT GAG CGG GAT ^39 
Pro Arg Met Lys Arg Glu Gly Leu Kis Tvr Val Val lie Glu Arg Asd 
465 470 * 475 

GAA AAG TAC ATG GCG AGT TTT GAC GAG ATC GTG CCG GAG TTC ATC GAA 
Glu Lys ,yr Met Ala Ser Phe Asp Glu lie Val Pro Glu Phe lie Glu 
430 4 35 490 495 

AAA ATG GAC GAG GCA CTG GCT GAA ATT GGT TTT GTA TTT GGG GAG CAA 19 9' 

Lys Met Asp Glu Ala Leu Ala Glu lie Gly Phe Val Phe Glv Glu Gin 
. 500 505 510 
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TGG CGA TAG- ATCCGGTACC CGAGCACGTG TTGACAATTA ATCATCGGCA 2 04 0 

Trp Arg * 

TAGTATATCG GCATAGTATA ATACGACAAG GTGAGGAACT AAACC ATG AGT ACT 2 0 94 

Met Ser Thr 
1 

GCA CTC GCA ACG CTG GCT GGG AAG CTG GCT GAA CGT GTC GGC ATG GAT 2 14 2 

Ala Leu Ala Thr Leu Ala GIv Lvs Leu Ala Giu Arg Val- Glv Me: Aso 
5 10 15 

TCT GTC GAC CCA CAG GAA CTG ATC ACC ACT CTT CGC CAG ACG GCA TTT 2 1?C 

Ser Val Asp Fro Gin Glu Leu lie Thr Thr Leu Arg Gin Thr Ala ?he 
2 0 2 5 3 0 3 5 

AAA GGT GAT GCC AGC GAT GCG CAG TTC ATC GCA TTA CTG ATC GTT GCC 2 23 3 

Lys Giy Asp Ala Ser Asp Ala Gin Phe lie Ala Leu Leu lie Val Ala 

40 45 50 

AAC CAG TAC GGC CTT AAT CCG TGG ACG AAA GAA ATT TAC GCC TTT CCT .2266 
A sr. Gin Tyr Giy Leu Asn Pro Trp Thr Lys Glu lie Tyr Ala Phe Pro 
55 60 65 

GAT AAG CAG AAT GGC ATC GTT CCG GTG GTG GGC GTT GAT GGC TGG TCC 2 334 

Asp Lys Gin Asn Giy lie Val Pro Val Val Glv Val Asd Glv Trp Ser 
70 75 80 

CGC ATC ATC AAT GAA AAC CAG CAG TTT GAT GGC ATG GAC TTT GAG CAG 2 3 32 

Arg lie lie Asn Glu Asn Gin Gin Phe Asp Glv Me: Aso Phe Glu Gin 
85 90 95 

GAC AAT GAA TCC TGT 1 ACA TGC CGG ATT TAC CGC AAG GAC CGT AAT CAT 243 0 

Asp Asn Glu Ser Cys Thr Cys Arg lie Tyr Arg Lys Asp Arg Asn His 
100 105 110 - * 115 

CCG ATC TGC GTT ACC GAA TGG ATG GAT GAA TGC CGC CGC GAA CCA TTC 24 78 

Pro lie Cys Val Thr Glu Trp Met Aso Glu Cvs Arg Arg Glu Pro Phe 
120 125 130 

AAA ACT CGC GAA. GGC AGA GAA ATC ACG GGG CCG TGG CAG TCG CAT CCC 2 5 2 -J 

Lys Thr Arg Glu Giy Arg Glu lie Thr Glv Pro Trp Gin Ser His Pro 
125 140 " 145 

AAA CGG ATG TTA CGT CAT AAA GCC ATG ATT CAG TGT GCC CGT CTG GCC 2 57 4 

Lys Arg Met Leu Arg His Lys Ala Met: lie Gin Cvs Ala Arg Leu Ala 
150 155 * 160 

TTC GGA TTT GCT GGT ATC. TAT GAC AAG GAT GAA GCC GAG CGC ATT GTC 262 2 

Phe Giy Phe Ala Giy lie Tyr Asp Lys Asp Glu Ala Glu Arg lie Val 
165 170 * " 175 

GAA AAT ACT GCA TAC ACT GCA GAA CGT CAG CCG GAA CGC GAC ATC ACT 2 6 70 

Glu Asn Thr Ala Tyr Thr Ala Glu Arg Gin Pro Glu Arg Asp He Thr 
150 155 19C " " 195 

CCG GTT AAC GAT GAA ACC ATG CAG GAG ATT AAC ACT CTG CTG ATC GCC 2713 
Pro Val Asn Asp Glu Thr Met Gin Glu He Asn Thr Leu Leu He Ala 
200 205 210 

CTG GAT AAA ACA TGG GAT GAC GAC TTA TTG CCG CTC TGT TCC CAG ATA 276 5 

Leu Asp Lys Thr Trp Asp Asp Asp Leu Leu Pro Leu Cys Ser Gin lie 
215 * 220 225 

TTT CGC CGC GAC ATT CGT GCA TCG TCA GAA CTG ACA CAG GCC GAA GCA 2 314 

Phe Arg Arg Asp lie Arg Ala Ser Ser Glu Leu Thr Gin. Ala Glu Ala 
230 235 240 
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GTA AAA GCT CTT GGA TTC CTG AAA CAG AAA GCC GCA GAG CAG AAG GTG 
Lys Aia Leu Gly Phs Leu Lys Gin Lys Ala Ala Glu Gin Lys Val 
. 250 255 

GCA GCA TAG ATCTCGAGAA GCTTCCTGCT GAACATCAAA GGCAAGAAAA - , 

a j. a Ala * *-^»X 

260 

CATCTGTTGT CAAAGACAGC ATCCTTGAAC AAGGACAATT AACAGTTAAC AAATAAAAAC 2 97 1 

G C AAAAG AAA ATGCCGATAT CCTATTGGCA TTTTCTTTTA TTTCTTATCA ACATAAAGGT 3 031 

GAATCCCATA CCTCGAGCTT C.^CGCTGCCG CAAGCACTCA GGGCGCAAGG GCTGCTAAAA 3C91 

GGAAGCGGAA CAr"?:.":iaj ~<-r~r*r- „„„„„ „ 

„^^„ u . u ^ uk . rt-^rt^rtuiovjiij i-iijrti_i_i_l_Uvi A TCjAATGTCA 315 1 

GCTACTGGGC TATCTGGACA AGGGAAAACG CAAGCGCAAA GAGAAAGCAG GTAGCTTGCA 3 211 

GTGGGCTTAC ATGGCGATAG CTAGACTGGG CGGTTTTATG GACAGCAAGC GAACCGGAAT 3271 

TGCCAGCTGG GGCGCCCTCT GGTAAGGTTG GGAAGCCCTG CAAAGTAAAC TGGATGGCTT 3 331 

TCTTGCCGCC AAGGATCTGA TGGCGCAGGG GATCAAGATC TGATCAAGAG ACAGGATGAG ' 3 3 91 

GATCGTTTCG C ATG GAT ATT AAT ACT GAA ACT GAG ATC AAG CAA AAG CAT 344 i 

Met Asp lie Asn Thr Glu Thr Glu He Lys Gin Lvs His 
1 S io 

lit f! A ^ C n CC T T T CCT GTT TTC CTA ATC AGC CCG GCA TTT CGC GGG 34 3 9 

Ser Leu Thr Pro Phe Pro Val Phe Leu He Ser Pro Ala Phe Arg Gly 

5 20 25 

* GA t AT ll T S AC ~ GC TAT TTC AGG AGT TCA GCC ATG AAC GCT TAT TAC 353"? 
Arg Tyr Phe His Ser Tyr Phe Arg Ser Ser Ala Met Asn Ala Tvr^Tvr 

. 35 40 * "--45 

ATT CAG GAT CGT CTT GAG GCT CAG AGC TGG GCG CGT CAC TAC CAG CAG 
x.e Gin Asp Arg Leu Glu Ala Gin Ser TrD Ala Arg His Tvr Gin Gin 
=° 55 " 60 

?1 C ? CC ^ GAG GAG GCA GAA CTG GCA GAC GAC ATG GAA AAA 35 ~ 3 

i—u Ala Arg Giu Giu Lys Glu Ala Glu Leu Aia Asd Asp Me: Glu Lvs 
S - 70 75 

GGC CTG CCC CAG CAC CTG TTT GAA TCG CTA TGC ATC GAT CAT TTG CAA 358^ 
G,y Leu. Pro Gin His ■ Leu , Phe Glu Ser Leu Cys lie Asd His Leu Gin 
80 85 go 

CGC CAC GGG GCC AGC AAA. AAA TCC ATT ACC CGT GCG -TTT GAT GAC GAT 3 72S 

Arg His Gly Ala Ser Lys Lys Ser lie Thr Arg Ala Phs Asd Asd Asd 
93 100 10S " " .- 

f-IT o AG IF CAG GAG CGC ATG GCA GAA CAG ATC CGG TAC ATG GTT GAA 3 77- 

Val Glu Phe Gin Glu Arg Mec Ala Glu His He Arg Tvr Met Val Glu 

x -° 115 120 125 

ACC ATT GCT CAC CAC CAG GTT GAT ATT GAT TCA GAG GTA TAA 3 8^9 

Thr lie Ala His His Gin Val- Asp He Asd Ser G^u Val - 
130 13 5 



2585 



AACGAGTAGA AGCTTGGCTG TTTTGGCGGA TGAGAGAAGA TTTTCAGCCT GATACAGATT 3879
AAATCAGAAC GCAGAAGCGG TCTGATAAAA CAGAATTTGC CTGGCGGCAG TAGCGCGGTG 3939
GTCCCACCTG ACCCCATGCC GAACTCAGAA GTGAAACGCC GTAGCGCCGA TGGTAGTGTG 3999
GGGTCTCCCC ATGCGAGAGT AGGGAACTGC CAGGCATCAA ATAAAACGAA AGGCTCAGTC 4059
GAAAGACTGG GCCTTTCGTT TTATCTGTTG TTTGTCGGTG AACGCTCTCC TGAGTAGGAC 4119
AAATCCGCCG GGAGCGGATT TGAACGTTGC GAAGCAACGG CCCGGAGGGT GGCGGGCAGG 4179
ACGCCCGCCA TAAACTGCCA GGCATCAAAT TAAGCAGAAG GCCATCCTGA CGGATGGCCT 4239
TTTTGCGTTT CTACAAACTC TTTTGTTTAT TTTTCTAAAT ACATTCAAAT ATGTATCCGC 4299
TCATGAGACA ATAACCCTGA TAAATGCTTC AATAATATTG AAAAAGGAAG AGTATGAGTA 4359
TTCAACATTT CCGTGTCGCC CTTATTCCCT TTTTTGCGGC ATTTTGCCTT CCTGTTTTTG 4419
CTCACCCAGA AACGCTGGTG AAAGTAAAAG ATGCTGAAGA TCAGTTGGGT GCACGAGTGG 4479
GTTACATCGA ACTGGATCTC AACAGCGGTA AGATCCTTGA GAGTTTTCGC CCCGAAGAAC 4539
GTTTTCCAAT GATGAGCACT TTTAAAGTTC TGCTATGTGG CGCGGTATTA TCCCGTGTTG 4599
ACGCCGGGCA AGAGCAACTC GGTCGCCGCA TACACTATTC TCAGAATGAC TTGGTTGAGT 4659
ACTCACCAGT CACAGAAAAG CATCTTACGG ATGGCATGAC AGTAAGAGAA TTATGCAGTG 4719
CTGCCATAAC CATGAGTGAT AACACTGCGG CCAACTTACT TCTGACAACG ATCGGAGGAC 4779
CGAAGGAGCT AACCGCTTTT TTGCACAACA TGGGGGATCA TGTAACTCGC CTTGATCGTT 4839
GGGAACCGGA GCTGAATGAA GCCATACCAA ACGACGAGCG TGACACCACG ATGCCTGTAG 4899
CAATGGCAAC AACGTTGCGC AAACTATTAA CTGGCGAACT ACTTACTCTA GCTTCCCGGC 4959
AACAATTAAT AGACTGGATG GAGGCGGATA AAGTTGCAGG ACCACTTCTG CGCTCGGCCC 5019
TTCCGGCTGG CTGGTTTATT GCTGATAAAT CTGGAGCCGG TGAGCGTGGG TCTCGCGGTA 5079
TCATTGCAGC ACTGGGGCCA GATGGTAAGC CCTCCCGTAT CGTAGTTATC TACACGACGG 5139
GGAGTCAGGC AACTATGGAT GAACGAAATA GACAGATCGC TGAGATAGGT GCCTCACTGA 5199
TTAAGCATTG GTAACTGTCA GACCAAGTTT ACTCATATAT ACTTTAGATT GATTTACGCG 5259
CCCTGTAGCG GCGCATTAAG CGCGGCGGGT GTGGTGGTTA CGCGCAGCGT GACCGCTACA 5319
CTTGCCAGCG CCCTjkOCGCC CGCTCCTTTC GCTTTCTTCC CTTCCTTTCT CGCCACGTTC 5379
GCCGGCTTTC CCCGTCAAGC TCTAAATCGG GGGCTCCCTT TAGGGTTCCG ATTTAGTGCT 5439
TTACGGCACC TCGACCCCAA AAAACTTGAT TTGGGTGATG GTTCACGTAG TGGGCCATCG 5499
CCCTGATAGA CGGTTTTTCG CCGTTTGACG TTGGAGTCCA CGTTCTTTAA TAGTGGACTC 5559
TTGTTCCAAA CTTGAACAAC ACTCAACCCT ATCTCGGGCT ATTCTTTTGA TTTATAAGGG 5619
ATTTTGCCGA TTTCGGCCTA TTGGTTAAAA AATGAGCTGA TTTAACAAAA ATTTAACGCG 5679
AATTTTAACA AAATATTAAC GTTTACAATT TAAAAGGATC TAGGTGAAGA TCCTTTTTGA 5739
TAATCTCATG ACCAAAATCC CTTAACGTGA GTTTTCGTTC CACTGAGCGT CAGACCCCGT 5799
AGAAAAGATC AAAGGATCTT CTTGAGATCC TTTTTTTCTG CGCGTAATCT GCTGCTTGCA 5859
AACAAAAAAA CCACCGCTAC CAGCGGTGGT TTGTTTGCCG GATCAAGAGC TACCAACTCT 5919
TTTTCCGAAG GTAACTGGCT TCAGCAGAGC GCAGATACCA AATACTGTCC TTCTAGTGTA 5979
GCCGTAGTTA GGCCACCACT TCAAGAACTC TGTAGCACCG CCTACATACC TCGCTCTGCT 6039
AATCCTGTTA CCAGTGGCTG CTGCCAGTGG CGATAAGTCG TGTCTTACCG GGTTGGACTC 6099
AAGACGATAG TTACCGGATA AGGCGCAGCG GTCGGGCTGA ACGGGGGGTT CGTGCACACA 6159
GCCCAGCTTG GAGCGAACGA CCTACACCGA ACTGAGATAC CTACAGCGTG AGCTATGAGA 6219
AAGCGCCACG CTTCCCGAAG GGAGAAAGGC GGACAGGTAT CCGGTAAGCG GCAGGGTCGG 6279
AACAGGAGAG CGCACGAGGG AGCTTCCAGG GGGAAACGCC TGGTATCTTT ATAGTCCTGT 6339
CGGGTTTCGC CACCTCTGAC TTGAGCGTCG ATTTTTGTGA TGCTCGTCAG GGGGGCGGAG 6399
CCTATGGAAA AACGCCAGCA ACGCGGCCTT TTTACGGTTC CTGGCCTTTT GCTGGCCTTT 6459
TGCTCACATG TTCTTTCCTG CGTTATCCCC TGATTCTGTG GATAACCGTA TTACCGCCTT 6519
TGAGTGAGCT GATACCGCTC GCCGCAGCCG AACGACCGAG CGCAGCGAGT CAGTGAGCGA 6579
GGAAGCGGAA GAGCGCCTGA TGCGGTATTT TCTCCTTACG CATCTGTGCG GTATTTCACA 6639
CCGCATAGGG TCATGGCTGC GCCCCGACAC CCGCCAACAC CCGCTGACGC GCCCTGACGG 6699
GCTTGTCTGC TCCCGGCATC CGCTTACAGA CAAGCTGTGA CCGTCTCCGG GAGCTGCATG 6759
TGTCAGAGGT TTTCACCGTC ATCACCGAAA CGCGCGAGGC AGCAAGGAGA TGGCGCCCAA 6819
CAGTCCCCCG GCCACGGGGC CTGCCACCAT ACCCACGCCG AAACAAGCGC TCATGAGCCC 6879
GAAGTGGCGA GCCCGATCTT CCCCATCGGT GATGTCGGCG ATATAGGCGC CAGCAACCGC 6939
ACCTGTGGCG CCGGTGATGC CGGCCACGAT GCGTCCGGCG TAGAGGATCT GCTCATGTTT 6999
GACAGCTTAT C 7010
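
The three CDS features listed for SEQ ID NO: 11 can be cross-checked against the protein lengths given for SEQ ID NO: 12 to 14 (227, 262 and 139 residues, counting the stop codon). The following is a minimal sketch of that arithmetic, not part of the original listing. It assumes 1-based inclusive coordinates and assumes that the end positions of the red beta and red gamma features read 2871 and 3819; the names `CDS_FEATURES`, `codon_count` and `summarize` are illustrative only.

```python
# CDS features of SEQ ID NO: 11 as (product, start, end), 1-based inclusive.
# Assumption: the end coordinates of red beta and red gamma are taken as
# 2871 and 3819 so that they agree with SEQ ID NO: 13 and 14.
CDS_FEATURES = [
    ("red alpha", 1320, 2000),
    ("red beta", 2086, 2871),
    ("red gamma", 3403, 3819),
]


def codon_count(start: int, end: int) -> int:
    """Return the number of codons (stop codon included) in start..end."""
    length = end - start + 1
    if length <= 0 or length % 3 != 0:
        raise ValueError(f"{start}..{end} is not a whole number of codons")
    return length // 3


def summarize(features):
    """Map each product name to its codon count."""
    return {name: codon_count(start, end) for name, start, end in features}


if __name__ == "__main__":
    for name, codons in summarize(CDS_FEATURES).items():
        print(f"{name}: {codons} codons including the stop codon")
```

A coordinate pair whose span is not a multiple of three raises `ValueError`, which makes misread digits in a feature location easy to spot.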

(2) INFORMATION FOR SEQ ID NO: 12: 

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 227 amino acids
(B) TYPE: amino acid
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: protein 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 12: 

Met Thr Pro Asp Ile Ile Leu Gln Arg Thr Gly Ile Asp Val Arg Ala
1               5                   10                  15

Val Glu Gln Gly Asp Asp Ala Trp His Lys Leu Arg Leu Gly Val Ile
            20                  25                  30

Thr Ala Ser Glu Val His Asn Val Ile Ala Lys Pro Arg Ser Gly Lys
        35                  40                  45

Lys Trp Pro Asp Met Lys Met Ser Tyr Phe His Thr Leu Leu Ala Glu
    50                  55                  60

Val Cys Thr Gly Val Ala Pro Glu Val Asn Ala Lys Ala Leu Ala Trp
65                  70                  75                  80

Gly Lys Gln Tyr Glu Asn Asp Ala Arg Thr Leu Phe Glu Phe Thr Ser
                85                  90                  95

Gly Val Asn Val Thr Glu Ser Pro Ile Ile Tyr Arg Asp Glu Ser Met
            100                 105                 110

Arg Thr Ala Cys Ser Pro Asp Gly Leu Cys Ser Asp Gly Asn Gly Leu
        115                 120                 125

Glu Leu Lys Cys Pro Phe Thr Ser Arg Asp Phe Met Lys Phe Arg Leu
    130                 135                 140

Gly Gly Phe Glu Ala Ile Lys Ser Ala Tyr Met Ala Gln Val Gln Tyr
145                 150                 155                 160

Ser Met Trp Val Thr Arg Lys Asn Ala Trp Tyr Phe Ala Asn Tyr Asp
                165                 170                 175

Pro Arg Met Lys Arg Glu Gly Leu His Tyr Val Val Ile Glu Arg Asp
            180                 185                 190

Glu Lys Tyr Met Ala Ser Phe Asp Glu Ile Val Pro Glu Phe Ile Glu
        195                 200                 205

Lys Met Asp Glu Ala Leu Ala Glu Ile Gly Phe Val Phe Gly Glu Gln
    210                 215                 220

Trp Arg
225

(2) INFORMATION FOR SEQ ID NO: 13:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 262 amino acids
(B) TYPE: amino acid
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: protein 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 13: 

Met Ser Thr Ala Leu Ala Thr Leu Ala Gly Lys Leu Ala Glu Arg Val
1               5                   10                  15

Gly Met Asp Ser Val Asp Pro Gln Glu Leu Ile Thr Thr Leu Arg Gln
            20                  25                  30

Thr Ala Phe Lys Gly Asp Ala Ser Asp Ala Gln Phe Ile Ala Leu Leu
        35                  40                  45

Ile Val Ala Asn Gln Tyr Gly Leu Asn Pro Trp Thr Lys Glu Ile Tyr
    50                  55                  60

Ala Phe Pro Asp Lys Gln Asn Gly Ile Val Pro Val Val Gly Val Asp
65                  70                  75                  80

Gly Trp Ser Arg Ile Ile Asn Glu Asn Gln Gln Phe Asp Gly Met Asp
                85                  90                  95

Phe Glu Gln Asp Asn Glu Ser Cys Thr Cys Arg Ile Tyr Arg Lys Asp
            100                 105                 110

Arg Asn His Pro Ile Cys Val Thr Glu Trp Met Asp Glu Cys Arg Arg
        115                 120                 125

Glu Pro Phe Lys Thr Arg Glu Gly Arg Glu Ile Thr Gly Pro Trp Gln
    130                 135                 140

Ser His Pro Lys Arg Met Leu Arg His Lys Ala Met Ile Gln Cys Ala
145                 150                 155                 160

Arg Leu Ala Phe Gly Phe Ala Gly Ile Tyr Asp Lys Asp Glu Ala Glu
                165                 170                 175

Arg Ile Val Glu Asn Thr Ala Tyr Thr Ala Glu Arg Gln Pro Glu Arg
            180                 185                 190

Asp Ile Thr Pro Val Asn Asp Glu Thr Met Gln Glu Ile Asn Thr Leu
        195                 200                 205

Leu Ile Ala Leu Asp Lys Thr Trp Asp Asp Asp Leu Leu Pro Leu Cys
    210                 215                 220

Ser Gln Ile Phe Arg Arg Asp Ile Arg Ala Ser Ser Glu Leu Thr Gln
225                 230                 235                 240

Ala Glu Ala Val Lys Ala Leu Gly Phe Leu Lys Gln Lys Ala Ala Glu
                245                 250                 255

Gln Lys Val Ala Ala *
            260

(2) INFORMATION FOR SEQ ID NO: 14: 

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 139 amino acids
(B) TYPE: amino acid
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: protein 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 14: 

Met Asp Ile Asn Thr Glu Thr Glu Ile Lys Gln Lys His Ser Leu Thr
1               5                   10                  15

Pro Phe Pro Val Phe Leu Ile Ser Pro Ala Phe Arg Gly Arg Tyr Phe
            20                  25                  30

His Ser Tyr Phe Arg Ser Ser Ala Met Asn Ala Tyr Tyr Ile Gln Asp
        35                  40                  45

Arg Leu Glu Ala Gln Ser Trp Ala Arg His Tyr Gln Gln Leu Ala Arg
    50                  55                  60

Glu Glu Lys Glu Ala Glu Leu Ala Asp Asp Met Glu Lys Gly Leu Pro
65                  70                  75                  80

Gln His Leu Phe Glu Ser Leu Cys Ile Asp His Leu Gln Arg His Gly
                85                  90                  95

Ala Ser Lys Lys Ser Ile Thr Arg Ala Phe Asp Asp Asp Val Glu Phe
            100                 105                 110

Gln Glu Arg Met Ala Glu His Ile Arg Tyr Met Val Glu Thr Ile Ala
        115                 120                 125

His His Gln Val Asp Ile Asp Ser Glu Val
    130                 135
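
The protein sequences of SEQ ID NO: 12 to 14 are written in three-letter amino acid code. For comparison with database entries it can help to convert them to one-letter code. The sketch below is an illustration only and not part of the original listing. The table holds the twenty standard residues, the function name `to_one_letter` is hypothetical, and the example row is the first row of SEQ ID NO: 14 with isoleucine and glutamine written as `Ile` and `Gln`.

```python
# Standard three-letter to one-letter amino acid codes.
THREE_TO_ONE = {
    "Ala": "A", "Arg": "R", "Asn": "N", "Asp": "D", "Cys": "C",
    "Gln": "Q", "Glu": "E", "Gly": "G", "His": "H", "Ile": "I",
    "Leu": "L", "Lys": "K", "Met": "M", "Phe": "F", "Pro": "P",
    "Ser": "S", "Thr": "T", "Trp": "W", "Tyr": "Y", "Val": "V",
}


def to_one_letter(three_letter: str) -> str:
    """Convert a whitespace-separated three-letter sequence to one-letter code.

    A "*" (stop) is passed through unchanged; any other unknown token raises
    ValueError, which flags misread residue names instead of hiding them.
    """
    out = []
    for token in three_letter.split():
        if token == "*":
            out.append("*")
        elif token in THREE_TO_ONE:
            out.append(THREE_TO_ONE[token])
        else:
            raise ValueError(f"unrecognized residue code: {token!r}")
    return "".join(out)


# First row of SEQ ID NO: 14 (residues 1 to 16).
FIRST_ROW = "Met Asp Ile Asn Thr Glu Thr Glu Ile Lys Gln Lys His Ser Leu Thr"

if __name__ == "__main__":
    print(to_one_letter(FIRST_ROW))
```

Rejecting unknown tokens rather than skipping them is a deliberate choice here: a misspelled residue name then stops the conversion instead of silently shortening the sequence.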
Table I: Sequences of Oligos for PCR

Figure 3ab

left: TGACCCCTCACAAGGAGACGACCTTCCATGACCGAGTACAAGAGGGATGTAACGCACTGA
right: TACAAATGTGGTATGGCTGATTATGATCCTCTAGAGTCGGTGCTCACTGCCCGCTTTCCA
template: pJP5603
targeting vector: pSV-paZ11

Figure 3c 

a-left: CTTCCATGACCGAGTACAAGAGGGATGTAACGCACTGA

a-right: ATGATCCTCTAGAGTCGGTGCTCACTGCCCGCTTTCCA

b-left: AGACGACCTTCCATGACCGAGTACAAGAGGGATGTAACGCACTGA

b-right: GCTGATTATGATCCTCTAGAGTCGGTGCTCACTGCCCGCTTTCCA

c-left: CACAAGGAGACGACCTTCCATGACCGAGTACAAGAGGGATGTAACGCACTGA

c-right: TGGTATGGCTGATTATGATCCTCTAGAGTCGGTGCTCACTGCCCGCTTTCCA

d-left: TGACCCCTCACAAGGAGACGACCTTCCATGACCGAGTACAAGAGGGATGTAACGCACTGA

d-right: TACAAATGTGGTATGGCTGATTATGATCCTCTAGAGTCGGTGCTCACTGCCCGCTTTCCA

e-left:

CACGCCCCTGACCCCTCACAAGGAGACGACCTTCCATGACCGAGTACAAGAGGGATGTAACGCACTGA
e-right:

TAAAACCTCTACAAATGTGGTATGGCTGATTATGATCCTCTAGAGTCGGTGCTCACTGCCCGCTTTCCA
f-left:

TCCCCTGACCCACGCCCCTGACCCCTCACAAGGAGACGACCTTCCATGACCGAGTACAAGAGGGATGT
AACGCACTGA

f-right:

TAAAGCAAGTAAAACCTCTACAAATGTGGTATGGCTGATTATGATCCTCTAGAGTCGGTGCTCACTGCC
CGCTTTCCA
template: pJP5603
targeting vector: pSV-paZ11
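
The left-hand oligos of Figure 3c appear to form a nested series: each longer oligo carries the shorter one at its 3' end and adds further bases at its 5' end. The sketch below, which is an illustration and not part of the original table, checks this for oligos a to d and reports how many 5' bases each one adds. The sequences are transcribed from the table with internal spaces removed; `common_suffix` and `five_prime_extensions` are hypothetical helper names.

```python
# Left-hand oligos a to d of Figure 3c, written 5' to 3' without spaces.
LEFT_OLIGOS = {
    "a-left": "CTTCCATGACCGAGTACAAGAGGGATGTAACGCACTGA",
    "b-left": "AGACGACCTTCCATGACCGAGTACAAGAGGGATGTAACGCACTGA",
    "c-left": "CACAAGGAGACGACCTTCCATGACCGAGTACAAGAGGGATGTAACGCACTGA",
    "d-left": "TGACCCCTCACAAGGAGACGACCTTCCATGACCGAGTACAAGAGGGATGTAACGCACTGA",
}


def common_suffix(seqs):
    """Return the longest 3' end shared by all sequences."""
    seqs = list(seqs)
    shortest = min(seqs, key=len)
    n = 0
    # Walk backwards from the 3' end while every sequence agrees.
    while n < len(shortest) and all(s[-1 - n] == shortest[-1 - n] for s in seqs):
        n += 1
    return shortest[len(shortest) - n:]


def five_prime_extensions(oligos):
    """Map each oligo name to the number of bases it adds 5' of the shared end."""
    core = common_suffix(oligos.values())
    return {name: len(seq) - len(core) for name, seq in oligos.items()}


if __name__ == "__main__":
    for name, extra in five_prime_extensions(LEFT_OLIGOS).items():
        print(f"{name}: {len(LEFT_OLIGOS[name])} nt, {extra} nt beyond the shared 3' end")
```

The same comparison could be run on the right-hand oligos or extended to the e and f pairs; only the input dictionary changes.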

Figure 3d 
a-left:

TCATCCTCTGCATGGTCAGGTCATGGATGAGCAGACGATGGTGCAGGATCAAGGGCTGCTAAAGGAA
a-right:

TAATGCGAACAGCGCACGGCGTTAAAGTTGTTCTGCTTCATCAGCAGGATGGCGAAGAACTCCAGCAT
b-left:

CACGAGCATCATCCTCTGCATGGTCAGGTCATGGATGAGCAGACGATGGCAAGGGCTGCTAAAGGAA
b-right:

TAATGCGAACAGCGCACGGCGTTAAAGTTGTTCTGCTTCATCAGCAGGATGGCGAAGAACTCCAGCAT
c-left:

TTAACCGTCACGAGCATCATCCTCTGCATGGTCAGGTCATGGATGAGCACAAGGGCTGCTAAAGGAA
c-right:

TAATGCGAACAGCGCACGGCGTTAAAGTTGTTCTGCTTCATCAGCAGGATGGCGAAGAACTCCAGCAT
d-left:

TGCTGCTGAACGGCAAGCCGTTGCTGATTCGAGGCGTTAACCGTCACGACAAGGGCTGCTAAAGGAA
d-right:

TAATGCGAACAGCGCACGGCGTTAAAGTTGTTCTGCTTCATCAGCAGGATGGCGAAGAACTCCAGCAT
e-left:

TCTCTATCGTGCGGTGGTTGAACTGCACACCGCCGACGGCACGCTGATTCAAGGGCTGCTAAAGGAA
e-right:

TAATGCGAACAGCGCACGGCGTTAAAGTTGTTCTGCTTCATCAGCAGGATGGCGAAGAACTCCAGCAT
f-left:

TGGAGTGACGGCAGTTATCTGGAAGATCAGGATATGTGGCGGATGAGCGCAAGGGCTGCTAAAGGAA
f-right:

TAATGCGAACAGCGCACGGCGTTAAAGTTGTTCTGCTTCATCAGCAGGATGGCGAAGAACTCCAGCAT
g-left:

TGCATTCTAGTTGTGGTTTGTCCAAACTCATCAATGTATCTTATCATGTCAAGGGCTGCTAAAGGAA
g-right:

TAATGCGAACAGCGCACGGCGTTAAAGTTGTTCTGCTTCATCAGCAGGATGGCGAAGAACTCCAGCAT
h-left:

TGCATTCTAGTTGTGGTTTGTCCAAACTCATCAATGTATCTTATCATGTCAAGGGCTGCTAAAGGAA
h-right:

TATTTTTGACACCAGACCAACTGGTAATGGTAGCGACCGGCGCTCAGCTGGCGAAGAACTCCAGCAT

targeting vector: pSV-paZ11

Figure 4 
left: 

I^^^^Z^^^^^"^^'^^^'^^^^^"^^ '^CGACTGAATCCGGTGAGAATGGCAAAAGCTTATGCCCACCAGC 

TGGTATGGCTGATTATGATC 

right: 

TCCAACATGGATGCTGATTTATATGGGTATAAATGGGCTCGCGATAATGTCGGGCAATCAGGTGCGACA 
ATCTACCACCAGCTCTTTTCTACGGGGTCTGACGC ULOACA 
template: pBR322 
targeting vector: Hoxa-PI 

Figure 5 
left: 

TGCAGCTGGCACGACAGGTTTCCCGACTGGAAAGCGGGCAGTTAATACGACTCACTATAGGGAGAACA 

GGAAACAGCTATGCCCATAACACCCAGAGTA 

right: 

GGGGTGCCGCTCAGT 

template: pmtrx (a pBluescript vector carrying mouse trithorax cDNA)
targeting vector: pZero2.1

Figure 6 

left:

TGAGACAATAACCCTGATAAATGCTTCAATAATATTGAAAAAGGAAGAGTATGGAGAAAAAAATCACT
GGATATACCACCG

right:

TACAGGGCGCGTAAATCAATCTAAAGTATATATGAGTAAACTTGGTCTGACAGTTACGCCCCGCCCTGC
CACTCATCGCA

template: pMAK705 

targeting vector: pBAD-24 backbone Amp resistant gene 

Figure 8 

i: 

TGCCAAGCTTGACCCACTGTGGAAGTGTTCCAAAAAGCGGGAAGGCTCTTGAGCTACTTCACTAACAAC 
CGG 

o- 
»■ 

TCACCATCTTCGGGCCATTTGTAGACTGGAATATTTCGAGCTATGAGTGTGCTACTTCACTAACAACCG 
G 

h: 

TGGCCCCAGGGTGACGCGGACATGGAGTTGTCCiCCAr.GGCACTGGTCCATGAGAGTGCCAAGCTACTC 
GCGAC 

template: pKa2 

targeting vector: Hoxa-PI 
Figure 9

j:

TAATAGCGAAGAGGCCCGCACCGATCGCCCTTCCCAACAGTTGCGCAGCCTGAATGGCGAATGGCGCT

TTGCCTGGTTTATAACTTCGTATAGCATACATTATACGAAGTTATGGGCTGCTAAAGGAAGCGGAACA--
G

k:

TGGCAGTTCAGGCCAATCCGCGCCGGATGCGGTGTATCGCTCGCCACTTCAACATCAACGGTAATCGC^
ATTTGACCATATAACTTCGTATAATGTATGCTATACGAAGTTATCCCCAGAGTCCCGCTCAGAAGAACT
template: pJP5603

targeting vector: JC9604 chromosome

Figure 10 
l:

TAGCTTGGCACTGGCCGTCGTTTTACAACGTCGTGACTGGGAAAACCCTGGCGTTACCCAACCCATCAC

ATATACCTGCCGTTCACTAT

m:

TATCGGTGGCCGTGGTGTCGGCTCCGGCGCCTTCATACTGCACCGGGCGGGAAGGCGATTCCGAAGCCr
AACCTTTCATAGAAGCC
template: pIB279
targeting vector: pSV-paZ11

l*: GCTTGGCACTGGCCGTCGTTTTACAACGTCGTGACTGGGAA
m*:

TCGGTGGCCGTGGTGTCGGCTCCGCCGCCTTCATACTGCACCGGGCGGGAAGGATCCACAGATTTGATC
CAGCGATACAGC

template: pSV-paZ11

targeting vector: pSV-sacB-neo

Figure 11

n:

TACCGCATTAAAGCTTATCGATGATAAGCTGTCAAACATGAGAATTGACCCGGAACCCTTCTCGAGGAA
GTTCCTATTCTCTAGAAAGTATAGGAACTTCCGAATAAATACCTGTGACGGAAGATCACTT

p:

TTCCCTCAAGAATTTTACTCTGTCAGAAACGGCCTTAACGACGTAGTCGAGGGACCTAGAAGTTCCTAT 
ACTTTCTAGAGAATAGGAACTTCATTATCACTTATTCAGGCGTAGCACCAGGCG 
template: pMAK705 
targeting vector: Hoxa-PI 

Figure 12 
left: 

TGAGACAATAACCCTGATAAATGCTTCAATAATATTGAAAAAGGAAGAGTATGGAGAAAAAAATCACT 

GGATATACCACCG 

right: 

TACAGGGCGCGTAAATCAATCTAAAGTATATATGAGTAAACTTGGTCTGACAGTTACGCCCCGCCCTGC 

CACTCATCGCA 

template: pMAK705 

targeting vector: pBAD-24 backbone Amp resistant gene 
AMENDED CLAIMS

[received by the International Bureau on 04 August 1999 (04.08.99); 
original claims 1-50 replaced by new claims 1-64 (10 pages)] 

1. A method for cloning DNA molecules in procaryotic cells comprising
the steps of: 

a) providing a procaryotic host cell capable of performing 
homologous recombination, 

b) contacting in said host cell a circular first DNA molecule
which is capable of being replicated in said host cell with a 
second DNA molecule comprising at least two regions of 
sequence homology to regions on the first DNA molecule, 
under conditions which favour homologous recombination 
between said first and second DNA molecules and 

c) selecting a host cell in which homologous recombination 
between said first and second DNA molecules has occurred. 

2. The method according to claim 1 wherein the homologous 
recombination occurs via the recET cloning mechanism. 

3. The method according to claim 2 wherein the host cell is capable of 
expressing recE and recT genes. 

4. The method according to claim 3 wherein the recE and recT genes 
are selected from E.coli recE and recT genes or from λ redα and redβ
genes. 

5. The method according to claim 3 or 4 wherein the host cell is 
transformed with at least one vector capable of expressing recE 
and/or recT genes. 

6. The method of claim 3, 4 or 5 wherein the expression of the recE
and/or recT genes is under control of a regulatable promoter.

7. The method of claim 5 or 6 wherein the recT gene is overexpressed
versus the recE gene.

8. The method according to any one of claims 3 to 7 wherein the recE
gene is selected from a nucleic acid molecule comprising

(a) the nucleic acid sequence from position 1320 (ATG) to 2159
(GAG) as depicted in Fig.7B,

(b) the nucleic acid sequence from position 1320 (ATG) to 1998
(CGA) as depicted in Fig.13B,

(c) a nucleic acid encoding the same polypeptide within the
degeneracy of the genetic code and/or

(d) a nucleic acid sequence which hybridizes under stringent
conditions with the nucleic acid sequence from (a), (b) and/or (c).

9. The method according to any one of claims 3 to 8 wherein the recT
gene is selected from a nucleic acid molecule comprising

(a) the nucleic acid sequence from position 2155 (ATG) to 2961
(GAA) as depicted in Fig.7B,

(b) the nucleic acid sequence from position 2086 (ATG) to 2868
(GCA) as depicted in Fig.13B,

(c) a nucleic acid encoding the same polypeptide within the
degeneracy of the genetic code and/or

(d) a nucleic acid sequence which hybridizes under stringent
conditions with the nucleic acid sequences from (a), (b) and/or (c).

10. The method according to any one of the previous claims wherein the
host cell is a gram-negative bacterial cell.

11. The method according to claim 10 wherein the host cell is an
Escherichia coli cell.

12. The method according to claim 11 wherein the host cell is an
Escherichia coli K12 strain.

13. The method according to claim 12 wherein the E.coli strain is 
selected from JC 8679 and JC 9604. 

14. The method according to any one of the previous claims wherein the
host cell further is capable of expressing a recBC inhibitor gene.

15. The method according to claim 14 wherein the host cell is
transformed with a vector expressing the recBC inhibitor gene.

16. The method according to claim 14 or 15 wherein the recBC inhibitor
gene is selected from a nucleic acid molecule comprising

(a) the nucleic acid sequence from position 3588 (ATG) to 4002
(GTA) as depicted in Fig.13B,

(b) a nucleic acid encoding the same polypeptide within the
degeneracy of the genetic code and/or

(c) a nucleic acid sequence which hybridizes under stringent
conditions (as defined above) with the nucleic acid sequence from (a)
and/or (b).

17. The method according to any one of claims 13 to 16 wherein the
host cell is a prokaryotic recBC+ cell.

18. The method according to any one of the previous claims wherein the
first DNA molecule is an extrachromosomal DNA molecule containing
an origin of replication which is operative in the host cell.

19. The method according to claim 18 wherein the first DNA molecule is
selected from plasmids, cosmids, P1 vectors, BAC vectors and PAC
vectors.
20. The method according to any one of claims 1-18 wherein the first
DNA molecule is a host cell chromosome.

21. The method according to any one of the previous claims wherein the
second DNA molecule is linear.

22. The method according to any one of the previous claims wherein the
regions of sequence homology are at least 15 nucleotides each.

23. The method according to one of claims 1 to 16 wherein the second
DNA molecule is obtained by an amplification reaction. 

24. The method according to one of the previous claims wherein the first 
and/or second DNA molecules are introduced into the host cells by 
transformation. 

25. The method according to claim 24 wherein the transformation 
method is electroporation. 

26. The method according to one of claims 1 to 25 wherein the first and 
second DNA molecules are introduced into the host cell 
simultaneously by co-transformation. 

27. The method according to one of claims 1 to 25 wherein the second 
DNA molecule is introduced into a host cell in which the first DNA 
molecule is already present. 

28. The method according to one of the previous claims wherein the 
second DNA molecule contains at least one marker gene placed 
between the two regions of sequence homology and wherein
homologous recombination is detected by expression of said marker 
gene. 
29. The method according to claim 28 wherein gene presence is selected
from antibiotic resistance genes, deficiency complementation genes 
and reporter genes. 

30. The method of any one of claims 1 to 29 wherein the first DNA 
molecule contains at least one marker gene between the two regions 
of sequence homology and wherein homologous recombination is 
detected by lack of expression of said marker gene. 

31. The method of any one of claims 1 to 30 wherein said marker gene
is selected from genes which, under selected conditions, convey a 
toxic or bacteriostatic effect on the cell, and reporter genes. 

32. A method according to any one of the previous claims wherein the 
first DNA molecule contains at least one target site for a site specific 
recombinase between the two regions of sequence homology and 
wherein homologous recombination is detected by removal of said 
target site. 

33. A method for cloning DNA molecules comprising the steps of: 

(a) providing a source of RecE and RecT proteins, 

(b) contacting a first DNA molecule which is capable of being 
replicated in a suitable host cell with a second DNA molecule 
comprising at least two regions of sequence homology to regions on 
the first DNA molecule, under conditions which favour homologous 
recombination between said first and second DNA molecules and

(c) selecting DNA molecules in which homologous recombination 
between said first and second DNA molecules has occurred. 

34. The method of claim 33 wherein said RecE and RecT proteins are
selected from E.coli RecE and RecT proteins or from phage λ Redα
and Redβ proteins.
35. The method of claim 33 or 34 wherein the recombination occurs in
vitro.

36. The method of claim 33 or 34 wherein the recombination occurs in
vivo.

37. A method for making a recombinant DNA molecule comprising
introducing into a prokaryotic host cell a circular first DNA molecule
which is capable of being replicated in said host cell, and introducing
a second DNA molecule comprising a first and a second region of
sequence homology to a third and fourth region, respectively, on the
first DNA molecule, said host cell being capable of performing
homologous recombination, such that a recombinant DNA molecule
is made, said recombinant DNA molecule comprising the first DNA
molecule wherein the sequences between said third and fourth
regions have been replaced by sequences between the first and
second regions of the second DNA molecule.

38. The method according to claim 37 which further comprises detecting
the recombinant DNA molecule.

39. A method for making a recombinant DNA molecule comprising
introducing into a prokaryotic host cell, containing a chromosomal
first DNA molecule, a second DNA molecule comprising a first and
a second region of sequence homology to a third and a fourth region,
respectively, on the host chromosomal first DNA molecule, said host
cell being capable of performing homologous recombination, such
that a recombinant DNA molecule is made, said recombinant DNA
molecule comprising the chromosomal first DNA molecule wherein
the sequences between said third and fourth regions have been
replaced by sequences between the first and second regions of the
second DNA molecule.

40. The method according to claim 39 which further comprises detecting
the recombinant DNA molecule.

41. The method according to any one of claims 37 to 40, wherein the
host cell is capable of expressing RecE and RecT proteins or λ exo
and λ β proteins.

42. A method for cloning DNA molecules comprising the steps of:

(a) contacting in vitro a first DNA molecule with a second DNA 
molecule comprising at least two regions of sequence 
homology to regions on the first DNA molecule, in the 
presence of RecE and RecT proteins and under conditions 
which favour homologous recombination between said first 
and second DNA molecules; and 

(b) selecting a DNA molecule in which homologous recombination 
between said first and second DNA molecules has occurred. 

43. A method for making a recombinant DNA molecule comprising
contacting in vitro a first DNA molecule with a second DNA molecule
comprising a first and a second region of sequence homology to a
third and a fourth region on the first DNA molecule, in the presence
of RecE and RecT proteins and under conditions in which
homologous recombination can occur, such that a recombinant DNA
molecule is made, said recombinant DNA molecule comprising the
first DNA molecule wherein the sequences between said third and
fourth regions have been replaced by sequences between the first
and second regions of the second DNA molecule.

44. The method of claim 42, which further comprises between steps (a)
and (b) the step of introducing the product of step (a) into a cell,
wherein recombination occurs in the cell.